OPTIMIZATION OF PEST RESISTANCE GENES
USING DNA SHUFFLING

CROSS-REFERENCE TO RELATED APPLICATIONS 

This application claims the benefit of [application number illegible]
(converted to provisional application Ser. No. 60/122,054), and provisional application
60/094,462, filed July 28, 1998.

FIELD OF THE INVENTION

This invention pertains to the field of development of optimized genes that
can render plants resistant to insects, nematodes, fungi, and other pests. 

BACKGROUND OF THE INVENTION 

Genes coding for proteins with insecticidal activities are currently used in
agriculture to control specific pests (Asgrow Reports - Genetic Engineering for Pest Control
- Len Copping, Chapters 2.1-2.4). For example, genes coding for Bacillus thuringiensis (Bt)
crystal proteins have been incorporated stably in several crops and are widely used as insect
control agents (Pest. Sci. (1998) 52: 165-175; Asgrow Reports, supra.). Several other
examples of different genes coding for insecticidal activity are also known (Asgrow Reports,
supra.). However, the greatest limitation to using many of these genes is lack of sufficient
activity (potency) and/or lack of useful spectrum of activity. For example, even the most
widely used family of genes, those coding for crystal proteins, is limited with respect to the
pests it controls and its potency against various economically important pests (Asgrow
Reports, supra.). For example, Bt toxins are weak versus corn rootworms and other
coleopteran pests.

Thus, a need exists for toxins that exhibit improved properties against various 
plant pests, and for methods of obtaining such toxins. Surprisingly, the present invention 
provides a strategy for solving each of the problems outlined above, as well as providing a
variety of other features which will become apparent upon complete review of the following
material.

SUMMARY OF THE INVENTION

The invention provides methods of obtaining an optimized, recombinant pest 
resistance gene which can confer resistance to a pest upon a plant in which the gene is 
expressed. The methods involve (1) recombining a plurality of forms of a nucleic acid 
which comprise segments derived from a gene which can confer upon a plant resistance to a 
pest, wherein the plurality of forms of the nucleic acid differ from each other in two or more
nucleotides, to produce a library of recombinant pest resistance genes; and (2) screening the 
library to identify at least one optimized recombinant pest resistance gene that exhibits 
improved pest resistance capability compared to a non-recombinant pest resistance gene. 

In some embodiments, the methods also involve (3) recombining at least one
optimized recombinant pest resistance gene with a further form of the pest resistance gene,
which is the same or different from one or more of the plurality of nucleic acid forms of (1),
to produce a further library of recombinant pest resistance genes; (4) screening the further
library to identify at least one further optimized recombinant pest resistance gene that
exhibits a further improvement in pest resistance capability compared to a non-recombinant
pest resistance gene; and (5) repeating (3) and (4), as necessary, until the further optimized
recombinant pest resistance gene exhibits a further improvement in pest resistance capability
compared to a non-recombinant pest resistance gene.
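
By way of illustration only, the recursive cycle of steps (1) through (5) can be sketched in
Python. The sketch treats genes as equal-length strings, forms chimeras by copying segments
between random crossover points, and ranks candidates with a caller-supplied scoring function
that stands in for a pest resistance assay. The function names, the crossover model, and the
parameters `rounds`, `library_size`, and `keep` are assumptions made for this example; they
are not taken from the described methods.

```python
import random


def recombine(parents, rng, crossovers=2):
    """Assemble one chimeric sequence from equal-length parent sequences.

    Each segment between random crossover points is copied from a randomly
    chosen parent (a simplified, hypothetical model of shuffling).
    """
    length = len(parents[0])
    points = sorted(rng.sample(range(1, length), crossovers))
    child, start = [], 0
    for end in points + [length]:
        template = rng.choice(parents)
        child.append(template[start:end])
        start = end
    return "".join(child)


def shuffle_and_screen(parents, score, rounds=3, library_size=200, keep=4, seed=0):
    """Alternate recombination (steps 1 and 3) with screening (steps 2 and 4)."""
    rng = random.Random(seed)
    pool = list(parents)
    for _ in range(rounds):
        library = [recombine(pool, rng) for _ in range(library_size)]
        # Screen the new library together with the current pool, so the best
        # retained candidate never scores lower than in the previous round.
        candidates = sorted(set(library) | set(pool))
        pool = sorted(candidates, key=score, reverse=True)[:keep]
    return pool[0]
```

A caller supplies at least two parent strings that differ in two or more positions and a
scoring function; in practice the score would come from a bioassay rather than from the
sequence itself.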

The invention also provides libraries that contain a plurality of recombinant
pest resistance genes, wherein each recombinant pest resistance gene contains different
permutations of segments of a gene which can confer upon a plant resistance to the pest.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows a scheme for in vitro shuffling, "recursive sequence
recombination," of genes.

Figure 2 shows a dendrogram of Bacillus thuringiensis [remainder illegible].

Figure 3 shows a dendrogram of a greater number of Bt toxin genes.

Figure 4 presents a dendrogram that shows the similarity among various types
of Cry1, Cry3, Cry7, Cry8, Cry14, and Cry18 toxins.

Figure 5 shows a schematic of a method for using A. rhizogenes to insert a 
shuffled toxin gene into hairy roots, which are then screened for the presence of toxin 
activity against a pest of interest.

DEFINITIONS

The term "screening" describes what is, in general, a two-step process in
which one first determines which cells do and do not express a screening marker and then
physically separates the cells having the desired property. Selection is a form of screening in
which identification and physical separation are achieved simultaneously by expression of a
selection marker, which, in some genetic circumstances, allows cells expressing the marker
to survive while other cells die (or vice versa). Screening markers include luciferase, beta-
galactosidase, and green fluorescent protein. Selection markers include drug and toxin
resistance genes. Although spontaneous selection can and does occur in the course of
natural evolution, in the present methods selection is performed by man.

An "exogenous DNA segment," "heterologous sequence" or a "heterologous
nucleic acid," as used herein, is one that originates from a source foreign to the particular
host cell, or, if from the same source, is modified from its original form. Thus, a
heterologous gene in a host cell includes a gene that is endogenous to the particular host cell,
but has been modified. Modification of a heterologous sequence in the applications
described herein typically occurs through the use of DNA shuffling. Thus, the terms refer to
a DNA segment which is foreign or heterologous to the cell, or homologous to the cell but in
a position within the host cell nucleic acid in which the element is not ordinarily found.
Exogenous DNA segments are expressed to yield exogenous polypeptides.

The term "gene" is used broadly to refer to any segment of DNA associated
with a biological function. Thus, genes include coding sequences and/or the regulatory
sequences required for their expression. Genes also include nonexpressed DNA segments
that, for example, form recognition sequences for other proteins. Genes can be obtained from
a variety of sources, including cloning from a source of interest or synthesizing from known
or predicted sequence information, and may include sequences designed to have desired
parameters.

By "an insecticidally effective part" of a pest resistance gene is meant a
DNA sequence encoding a polypeptide which has fewer amino acids than the respective
full-length polypeptide encoded by the pest resistance gene, but which is still toxic to the
target pest.

The term "isolated," when applied to a nucleic acid or protein, denotes that
the nucleic acid or protein is essentially free of other cellular components with which it is
associated in the natural state. It is preferably in a homogeneous state although it can be in
either a dry or aqueous solution. Purity and homogeneity are typically determined using
analytical chemistry techniques such as polyacrylamide gel electrophoresis or high
performance liquid chromatography. A protein which is the predominant species present in
a preparation is substantially purified. In particular, an isolated gene is separated from open
reading frames which flank the gene and encode a protein other than the gene of interest.
The term "purified" denotes that a nucleic acid or protein gives rise to essentially one band
in an electrophoretic gel. Particularly, it means that the nucleic acid or protein is at least
about 50% pure, more preferably at least about 85% pure, and most preferably at least about
99% pure.

The term "naturally-occurring" is used to describe an object that can be found
in nature as distinct from being artificially produced by man. For example, a polypeptide or
polynucleotide sequence that is present in an organism (including viruses) that can be
isolated from a source in nature and which has not been intentionally modified by man in the
laboratory is naturally-occurring.

The term "nucleic acid" refers to deoxyribonucleotides or ribonucleotides and
polymers thereof in either single- or double-stranded form. Unless specifically limited, the
term encompasses nucleic acids containing known analogues of natural nucleotides which
have similar binding properties as the reference nucleic acid and are metabolized in a manner
similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic
acid sequence also implicitly encompasses conservatively modified variants thereof (e.g.,
degenerate codon substitutions) and complementary sequences, as well as the sequence
explicitly indicated. Specifically, degenerate codon substitutions may be achieved by
generating sequences in which the third position of one or more selected (or all) codons is
substituted with mixed-base and/or deoxyinosine residues (Batzer et al. (1991) Nucleic Acid
Res. 19: 5081; Ohtsuka et al. (1985) J. Biol. Chem. 260: 2605-2608; Cassol et al. (1992);
Rossolini et al. (1994) Mol. Cell. Probes 8: 91-98). The term nucleic acid is used
interchangeably with gene, cDNA, and mRNA encoded by a gene.
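
As a minimal sketch of the degenerate codon substitution just described, the Python function
below replaces the third position of selected codons (or of all codons) with a placeholder
symbol. The function name and arguments are hypothetical. "N" is the IUPAC code for a mixed
base; the use of "I" to denote deoxyinosine is an assumption made only for this example.

```python
def degenerate_third_positions(coding_seq, codon_indices=None, symbol="N"):
    """Substitute the third base of selected codons (all codons if None)."""
    if len(coding_seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [coding_seq[i:i + 3] for i in range(0, len(coding_seq), 3)]
    selected = range(len(codons)) if codon_indices is None else codon_indices
    for index in selected:
        # Keep the first two bases and replace only the third ("wobble") position.
        codons[index] = codons[index][:2] + symbol
    return "".join(codons)
```

Calling the function with `codon_indices=[1]` alters only the second codon, which corresponds
to substituting "one or more selected" codons rather than all of them.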

"Nucleic acid derived from a gene" refers to a nucleic acid for whose
synthesis the gene, or a subsequence thereof, has ultimately served as a template. Thus, an
mRNA, a cDNA reverse transcribed from an mRNA, an RNA transcribed from that cDNA, a
DNA amplified from the cDNA, an RNA transcribed from the amplified DNA, etc., are all
derived from the gene and detection of such derived products is indicative of the presence
and/or abundance of the original gene and/or gene transcript in a sample.

A nucleic acid is "operably linked" when it is placed into a functional
relationship with another nucleic acid sequence. For instance, a promoter or enhancer is
operably linked to a coding sequence if it increases the transcription of the coding sequence.
Operably linked means that the DNA sequences being linked are typically contiguous and,
where necessary to join two protein coding regions, contiguous and in reading frame.
However, since enhancers generally function when separated from the promoter by several
kilobases and intronic sequences may be of variable lengths, some polynucleotide elements
may be operably linked but not contiguous.

A specific binding affinity between two molecules, for example, a ligand and 
a receptor, means a preferential binding of one molecule for another in a mixture of 
molecules. The binding of the molecules can be considered specific if the binding affinity is
about 1 x 10^[exponent illegible] M to about 1 x 10^[exponent illegible] M or greater.

The term "recombinant" when used with reference to a cell indicates that the 
cell replicates a heterologous nucleic acid, or expresses a peptide or protein encoded by a 
heterologous nucleic acid. Recombinant cells can contain genes that are not found within 
the native (non-recombinant) form of the cell. Recombinant cells can also contain genes 
found in the native form of the cell wherein the genes are modified and re-introduced into 
the cell by artificial means. The term also encompasses cells that contain a nucleic acid 
endogenous to the cell that has been modified without removing the nucleic acid from the
cell; such modifications include those obtained by gene replacement, site-specific mutation, 
and related techniques. 

A "recombinant expression cassette" or simply an "expression cassette" is a
nucleic acid construct, generated recombinantly or synthetically, with nucleic acid elements
that are capable of effecting expression of a structural gene in hosts compatible with such
sequences. Expression cassettes include at least promoters and, optionally, transcription
termination signals. Typically, the recombinant expression cassette includes a nucleic acid
to be transcribed (e.g., a nucleic acid encoding a desired polypeptide), and a promoter.
Additional factors necessary or helpful in effecting expression may also be used as described
herein. For example, an expression cassette can also include nucleotide sequences that
encode a signal sequence that directs secretion of an expressed protein from the host cell.



WO 99/57128 PCT/US99/08473

Transcription termination signals, enhancers, and other nucleic acid sequences that influence 
gene expression, can also be included in an expression cassette.

The terms "identical" or percent "identity," in the context of two or more
nucleic acid or polypeptide sequences, refer to two or more sequences or subsequences that
are the same or have a specified percentage of amino acid residues or nucleotides that are the
same, when compared and aligned for maximum correspondence, as measured using one of
the following sequence comparison algorithms or by visual inspection.

The phrase "substantially identical," in the context of two nucleic acids or
polypeptides, refers to two or more sequences or subsequences that have at least 60%,
preferably 80%, most preferably 90-95% nucleotide or amino acid residue identity, when
compared and aligned for maximum correspondence, as measured using one of the following
sequence comparison algorithms or by visual inspection. Preferably, the substantial identity
exists over a region of the sequences that is at least about 50 residues in length, more
preferably over a region of at least about 100 residues, and most preferably the sequences are
substantially identical over at least about 150 residues. In a most preferred embodiment, the
sequences are substantially identical over the entire length of the coding regions.

For sequence comparison, typically one sequence acts as a reference sequence
to which test sequences are compared. When using a sequence comparison algorithm, test
and reference sequences are input into a computer, subsequence coordinates are designated,
if necessary, and sequence algorithm program parameters are designated. The sequence
comparison algorithm then calculates the percent sequence identity for the test sequence(s)
relative to the reference sequence, based on the designated program parameters.
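
For illustration, percent identity between two sequences that have already been aligned can
be computed as in the Python sketch below. It assumes that gaps are written as "-" and that
identity is counted over alignment columns; other denominators are also in use, so this is
one convention rather than the definitive one. The function names and the default thresholds
(the broadest values recited above, 60% identity over about 50 residues) are choices made
for the example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences have a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no comparable positions")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def is_substantially_identical(aligned_a, aligned_b, min_identity=60.0, min_length=50):
    """Apply an identity threshold over a minimum aligned length."""
    return (len(aligned_a) >= min_length
            and percent_identity(aligned_a, aligned_b) >= min_identity)
```

Raising `min_identity` to 80 or 90 and `min_length` to 100 or 150 corresponds to the more
preferred ranges stated in the definition.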

Optimal alignment of sequences for comparison can be conducted, e.g., by
the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the
homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the
search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444
(1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA,
and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575
Science Dr., Madison, WI), or by visual inspection (see generally Ausubel et al., infra).
One example of an algorithm that is suitable for determining percent sequence
identity and sequence similarity is the BLAST algorithm, which is described in Altschul et
al., J. Mol. Biol. 215:403-410 (1990). Software for performing BLAST analyses is publicly
available through the National Center for Biotechnology Information
(http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring
sequence pairs (HSPs) by identifying short words of length W in the query sequence, which
either match or satisfy some positive-valued threshold score T when aligned with a word of
the same length in a database sequence. T is referred to as the neighborhood word score
threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for
initiating searches to find longer HSPs containing them. The word hits are then extended in
both directions along each sequence for as far as the cumulative alignment score can be
increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters
M (reward score for a pair of matching residues; always > 0) and N (penalty score for
mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to
calculate the cumulative score. Extension of the word hits in each direction is halted when:
the cumulative alignment score falls off by the quantity X from its maximum achieved
value; the cumulative score goes to zero or below, due to the accumulation of one or more
negative-scoring residue alignments; or the end of either sequence is reached. The BLAST
algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The
BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an
expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands. For
amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an
expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1989)
Proc. Natl. Acad. Sci. USA 89:10915).
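
The extension step can be illustrated with a short Python sketch for nucleotide sequences.
It extends a word hit to the right only (extension to the left is symmetric) and stops under
the three conditions listed above. The function name and the default value of `x_drop` are
assumptions made for the example; the defaults for `match` and `mismatch` follow the M=5 and
N=-4 values given in the text. This is a simplified sketch and not the BLAST program itself.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Ungapped rightward extension of a word hit.

    Returns (best_score, best_length), where best_length counts residues
    from the start of the word hit.
    """
    def pair_score(offset):
        same = query[q_start + offset] == subject[s_start + offset]
        return match if same else mismatch

    score = sum(pair_score(i) for i in range(word_len))
    best_score, best_len = score, word_len
    i = word_len
    # Condition 3: stop when the end of either sequence is reached.
    while q_start + i < len(query) and s_start + i < len(subject):
        score += pair_score(i)
        i += 1
        if score > best_score:
            best_score, best_len = score, i
        # Condition 2: score at or below zero. Condition 1: drop of X from the maximum.
        if score <= 0 or best_score - score >= x_drop:
            break
    return best_score, best_len
```

A smaller `x_drop` terminates extension sooner through a run of mismatches, which trades
sensitivity for speed in the way the text attributes to the parameter X.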

In addition to calculating percent sequence identity, the BLAST algorithm
also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin
& Altschul (1993) Proc. Nat'l. Acad. Sci. USA 90:5873-5787). One measure of similarity
provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an
indication of the probability by which a match between two nucleotide or amino acid
sequences would occur by chance. For example, a nucleic acid is considered similar to a
reference sequence if the smallest sum probability in a comparison of the test nucleic acid to
the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and
most preferably less than about 0.001.

Another indication that two nucleic acid sequences are substantially identical
is that the two molecules hybridize to each other under stringent conditions. The phrase
"hybridizing specifically to," refers to the binding, duplexing, or hybridizing of a molecule
only to a particular nucleotide sequence under stringent conditions when that sequence is
present in a complex mixture (e.g., total cellular) DNA or RNA. "Bind(s) substantially"
refers to complementary hybridization between a probe nucleic acid and a target nucleic acid
and embraces minor mismatches that can be accommodated by reducing the stringency of
the hybridization media to achieve the desired detection of the target polynucleotide
sequence.

"Stringent hybridization conditions" and "stringent hybridization wash
conditions" in the context of nucleic acid hybridization experiments such as Southern and
northern hybridizations are sequence dependent, and are different under different
environmental parameters. Longer sequences hybridize specifically at higher temperatures.
An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993)
Laboratory Techniques in Biochemistry and Molecular Biology - Hybridization with Nucleic
Acid Probes part 1 chapter 2 "Overview of principles of hybridization and the strategy of
nucleic acid probe assays," Elsevier, New York. Generally, highly stringent hybridization
and wash conditions are selected to be about 5°C lower than the thermal melting point (Tm)
for the specific sequence at a defined ionic strength and pH. Typically, under "stringent
conditions" a probe will hybridize to its target subsequence, but to no other sequences.

The Tm is the temperature (under defined ionic strength and pH) at which
50% of the target sequence hybridizes to a perfectly matched probe. Very stringent
conditions are selected to be equal to the Tm for a particular probe. An example of stringent
hybridization conditions for hybridization of complementary nucleic acids which have more
than 100 complementary residues on a filter in a Southern or northern blot is 50%
formamide with 1 mg of heparin at 42°C, with the hybridization being carried out overnight.
An example of highly stringent wash conditions is 0.15M NaCl at 72°C for about 15
minutes. An example of stringent wash conditions is a 0.2x SSC wash at 65°C for 15
minutes (see, Sambrook, infra., for a description of SSC buffer). Often, a high stringency
wash is preceded by a low stringency wash to remove background probe signal. An example
medium stringency wash for a duplex of, e.g., more than 100 nucleotides, is 1x SSC at 45°C
for 15 minutes. An example low stringency wash for a duplex of, e.g., more than 100
nucleotides, is 4-6x SSC at 40°C for 15 minutes. For short probes (e.g., about 10 to 50
nucleotides), stringent conditions typically involve salt concentrations of less than about 1.0
M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to
8.3, and the temperature is typically at least about 30°C. Stringent conditions can also be
achieved with the addition of destabilizing agents such as formamide. In general, a signal to
noise ratio of 2x (or higher) than that observed for an unrelated probe in the particular
hybridization assay indicates detection of a specific hybridization. Nucleic acids which do
not hybridize to each other under stringent conditions are still substantially identical if the
polypeptides which they encode are substantially identical. This occurs, e.g., when a copy of
a nucleic acid is created using the maximum codon degeneracy permitted by the genetic
code.

A further indication that two nucleic acid sequences or polypeptides are
substantially identical is that the polypeptide encoded by the first nucleic acid is
immunologically cross reactive with, or specifically binds to, the polypeptide encoded by the
second nucleic acid. Thus, a polypeptide is typically substantially identical to a second
polypeptide, for example, where the two peptides differ only by conservative substitutions.

The phrase "specifically (or selectively) binds to an antibody" or "specifically
(or selectively) immunoreactive with," when referring to a protein or peptide, refers to a
binding reaction which is determinative of the presence of the protein in the presence of a
heterogeneous population of proteins and other biologics. Thus, under designated
immunoassay conditions, the specified antibodies bind to a particular protein and do not bind
in a significant amount to other proteins present in the sample. Specific binding to an
antibody under such conditions may require an antibody that is selected for its specificity for
a particular protein. For example, antibodies raised to the protein with the amino acid
sequence encoded by any of the polynucleotides of the invention can be selected to obtain
antibodies specifically immunoreactive with that protein and not with other proteins except
for polymorphic variants. A variety of immunoassay formats may be used to select
antibodies specifically immunoreactive with a particular protein. For example, solid-phase
ELISA immunoassays, Western blots, or immunohistochemistry are routinely used to select
monoclonal antibodies specifically immunoreactive with a protein. See Harlow and Lane
(1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Publications, New York
("Harlow and Lane"), for a description of immunoassay formats and conditions that can be
used to determine specific immunoreactivity. Typically a specific or selective reaction will
be at least twice background signal or noise, and more typically more than 10 to 100 times
background.

"Conservatively modified variations" of a particular polynucleotide sequence 
refers to those polynucleotides that encode identical or essentially identical amino acid 
sequences, or where the polynucleotide does not encode an amino acid sequence, to 
essentially identical sequences. Because of the degeneracy of the genetic code, a large 
number of functionally identical nucleic acids encode any given polypeptide. For instance,
the codons CGU, CGC, CGA, CGG, AGA, and AGG all encode the amino acid arginine.
Thus, at every position where an arginine is specified by a codon, the codon can be altered to
any of the corresponding codons described without altering the encoded polypeptide. Such
nucleic acid variations are "silent variations," which are one species of "conservatively
modified variations." Every polynucleotide sequence described herein which encodes a
polypeptide also describes every possible silent variation, except where otherwise noted.
One of skill will recognize that each codon in a nucleic acid (except AUG, which is
ordinarily the only codon for methionine) can be modified to yield a functionally identical
molecule by standard techniques. Accordingly, each "silent variation" of a nucleic acid
which encodes a polypeptide is implicit in each described sequence. 
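
The arginine example can be reproduced with a short Python sketch built on the standard
genetic code, written here with RNA codons to match the text. The compact 64-letter string
lists the amino acids for codons ordered U, C, A, G at each position, with "*" marking stop
codons. The function names are hypothetical and chosen only for this illustration.

```python
from itertools import product
from math import prod

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" is a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """All codons that encode the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def count_encodings(peptide):
    """Number of distinct coding sequences that encode the peptide."""
    return prod(len(synonymous_codons(aa)) for aa in peptide)
```

Because the count multiplies across positions, even a short peptide has many silent
variations; subtracting one from `count_encodings` gives the number of silent variants of
any single coding sequence.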

Furthermore, one of skill will recognize that individual substitutions,
deletions or additions which alter, add or delete a single amino acid or a small percentage of
amino acids (typically less than 5%, more typically less than 1%) in an encoded sequence are
"conservatively modified variations" where the alterations result in the substitution of an
amino acid with a chemically similar amino acid. Conservative substitution tables providing
functionally similar amino acids are well known in the art. The following five groups each
contain amino acids that are conservative substitutions for one another: Aliphatic: Glycine
(G), Alanine (A), Valine (V), Leucine (L), Isoleucine (I); Aromatic: Phenylalanine (F),
Tyrosine (Y), Tryptophan (W); Sulfur-containing: Methionine (M), Cysteine (C); Basic:
Arginine (R), Lysine (K), Histidine (H); Acidic: Aspartic acid (D), Glutamic acid (E),
Asparagine (N), Glutamine (Q). See also, Creighton (1984) Proteins, W.H. Freeman and
Company. In addition, individual substitutions, deletions or additions which alter, add or
delete a single amino acid or a small percentage of amino acids in an encoded sequence are
also "conservatively modified variations."
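
A minimal Python sketch of how the five groups above might be applied is given below. It
checks whether two equal-length peptides differ only by substitutions within a single group.
The group membership is taken from the list above; the function names, and the choice to
treat residues outside the five groups (for example serine, threonine, and proline) as having
no conservative partner, are assumptions of this example.

```python
CONSERVATIVE_GROUPS = {
    "aliphatic": "GAVLI",
    "aromatic": "FYW",
    "sulfur-containing": "MC",
    "basic": "RKH",
    "acidic": "DENQ",
}
# Map each one-letter residue to the name of its group.
GROUP_OF = {
    residue: name
    for name, members in CONSERVATIVE_GROUPS.items()
    for residue in members
}


def is_conservative(residue_a, residue_b):
    """True if two different residues belong to the same group."""
    group = GROUP_OF.get(residue_a)
    return (residue_a != residue_b
            and group is not None
            and group == GROUP_OF.get(residue_b))


def differ_only_conservatively(peptide_a, peptide_b):
    """True if equal-length peptides differ only by within-group substitutions."""
    if len(peptide_a) != len(peptide_b):
        return False
    return all(a == b or is_conservative(a, b)
               for a, b in zip(peptide_a, peptide_b))
```

The sketch handles substitutions only; deletions and additions, which the text also counts as
conservatively modified variations, would require an alignment step first.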

Two nucleic acids "correspond" when they have the same sequence, or when
one nucleic acid is a subsequence of the other, or when one sequence is derived, by natural
or artificial manipulation, from the other. A nucleic acid corresponds to a protein when it
encodes the protein or a substantial fragment of the protein (typically a fragment of at least 
about 5% of the protein). 

A "subsequence" refers to a sequence of nucleic acids or amino acids that
comprise a part of a longer sequence of nucleic acids or amino acids (e.g., polypeptide)
respectively.

Nucleic acids are "elongated" when additional nucleotides (or other 
analogous molecules) are incorporated into the nucleic acid. Most commonly, this is 
performed with a polymerase (e.g., a DNA polymerase), e.g., a polymerase which adds
sequences at the 3' terminus of the nucleic acid. 

Two nucleic acids are "recombined" when sequences from each of the two
nucleic acids are combined in a progeny nucleic acid. Two sequences are "directly"
recombined when both of the nucleic acids are substrates for recombination. Two sequences 
are "indirectly recombined" when the sequences are recombined using an intermediate such 
as a cross-over oligonucleotide. For indirect recombination, no more than one of the 
sequences is an actual substrate for recombination, and in some cases, neither sequence is a 
substrate for recombination. 

DETAILED DESCRIPTION 

I. INTRODUCTION

The present invention provides methods for evolving, i.e., modifying, a
nucleic acid for the acquisition of, or an improvement in, a property or characteristic useful
in conferring upon plants resistance to pests, including, but not limited to, insects,
nematodes, fungi, and arachnids. The methods involve using DNA shuffling to obtain
recombinant pest resistance genes that, when present in or on a plant, enhance the plant's
defenses against a pest. The invention provides significant advantages over previously used
methods for optimization of pest resistance genes. For example, DNA shuffling can result in
optimization of a desirable property even in the absence of a detailed understanding of the
mechanism by which the particular property is mediated. Sequence recombination can be
achieved in many different formats and permutations of formats, as described in further
detail below. These formats share some common principles.

The substrates for this modification, or evolution, vary in different
applications, as does the property sought to be acquired or improved. Examples of candidate
substrates for acquisition of a property or improvement in a property include genes that
encode insecticidal proteins. The methods require at least two variant forms of a starting
substrate. The variant forms of candidate substrates can show substantial sequence or
secondary structural similarity with each other, but they should also differ in at least two
positions. The initial diversity between forms can be the result of natural variation, e.g., the
different variant forms (homologs) are obtained from different individuals or strains of an
organism (including geographic variants) or constitute related sequences from the same
organism (e.g., allelic variations). Alternatively, the initial diversity can be induced, e.g., the
second variant form can be generated by error-prone transcription, such as an error-prone
PCR or use of a polymerase which lacks proof-reading activity (see, Liao (1990) Gene
88:107-111), of the first variant form, or, by replication of the first form in a mutator strain
(mutator host cells are discussed in further detail below). The initial diversity between
substrates is greatly augmented in subsequent steps of recombination.

The properties or characteristics that can be sought to be acquired or
improved vary widely, and, of course, depend on the choice of substrate. For example, for
pest resistance genes, properties that one can improve include, but are not limited to,
increased range of pests against which a particular resistance gene is effective, increased
potency against a pest, delay or elimination of the ability of pests to develop resistance to the
gene product, increased expression level of the resistance gene, increased resistance to
protease degradation and to destabilizing conditions such as low or high pH, and reduced
toxicity to the host plant. At least two variant forms of a nucleic acid which can confer pest
resistance are recombined to produce a library of recombinant pest resistance genes. The
library is then screened to identify at least one recombinant pest resistance gene that is
optimized for the particular property or properties of interest. The variant forms of candidate
pest resistance genes can have substantial sequence or secondary structural similarity with
each other, but they should also differ in at least two positions. The initial diversity between
forms can be the result of natural variation, e.g., the different variant forms (homologs) are
obtained from different individuals or strains of an organism (including geographic variants;
termed "family shuffling") or constitute related sequences from the same organism (e.g.,
allelic variations). Alternatively, the initial diversity can be induced, e.g., the second variant
form can be generated by error-prone transcription, such as an error-prone PCR or use of a
polymerase which lacks proof-reading activity (see, Liao (1990) Gene 88:107-111), of the
first variant form, or, by replication of the first form in a mutator strain (mutator host cells
are discussed in further detail below).

Often, improvements are achieved after one round of recombination and
selection. However, recursive sequence recombination can be employed to achieve still
further improvements in a desired property. Recursive sequence recombination entails
successive cycles of recombination to generate molecular diversity. That is, one creates a
family of nucleic acid molecules showing some sequence identity to each other but differing
in the presence of mutations. In any given cycle, recombination can occur in vivo or in vitro,
intracellularly or extracellularly. Furthermore, diversity resulting from recombination can be
augmented in any cycle by applying prior methods of mutagenesis (e.g., error-prone PCR or
cassette mutagenesis) to either the substrates or products for recombination. In some
instances, a new or improved property or characteristic can be achieved after only a single
cycle of in vivo or in vitro recombination, as when using different, variant forms of the
sequence, such as homologs from different individuals or strains of an organism, or related
sequences from the same organism, as allelic variations.

A recombination cycle is usually followed by at least one cycle of screening
or selection for molecules having a desired property or characteristic. If a recombination
cycle is performed in vitro, the products of recombination, i.e., recombinant segments, are
sometimes introduced into cells before the screening step. Recombinant segments can also
be linked to an appropriate vector or other regulatory sequences before screening.
Alternatively, products of recombination generated in vitro are sometimes packaged as
viruses before screening. If recombination is performed in vivo, recombination products can
sometimes be screened in the cells in which recombination occurred. In other applications,
recombinant segments are extracted from the cells, and optionally packaged as viruses,
before screening.

The nature of screening or selection depends on what property or
characteristic is to be acquired or the property or characteristic for which improvement is
sought, and many examples are discussed below. It is not usually necessary to understand
the molecular basis by which particular products of recombination (recombinant segments)
have acquired new or improved properties or characteristics relative to the starting
substrates. For example, a pest resistance gene can have many component sequences each
having a different intended role (e.g., coding sequence, regulatory sequences, targeting
sequences, stability-conferring sequences, and sequences affecting integration). Each of
these component sequences can be varied and recombined simultaneously.
Screening/selection can then be performed, for example, for recombinant segments that have
increased ability to confer pest resistance upon a plant without the need to attribute such
improvement to any of the individual component sequences of the vector.

Depending on the particular screening protocol used for a desired property,
initial round(s) of screening can sometimes be performed using bacterial cells due to high
transfection efficiencies and ease of culture. Later rounds, and other types of screening
which are not amenable to screening in bacterial cells, are performed in plant cells to
optimize recombinant segments for use in an environment close to that of their intended use.
Final rounds of screening can be performed in the precise cell type of intended use (e.g., a
cell which is present in a plant). In some methods, use of a recombinant pest resistance gene
can itself be used as a round of screening. That is, recombinant pest resistance genes that are
successfully taken up and/or expressed by the intended target cells are recovered from those
target cells and used to confer resistance upon other plants. The recombinant pest resistance
genes that are recovered from the first target cells are enriched for genes that have evolved,
i.e., have been modified by recursive sequence recombination, toward improved or new
properties or characteristics for specific uptake and integration of the gene, effectiveness
against the pest, stability, and the like.

The screening or selection step identifies a subpopulation of recombinant
segments that have evolved toward acquisition of a new or improved desired property or
properties useful in conferring pest resistance upon plants. Depending on the screen, the
recombinant segments can be identified as components of cells, components of viruses or in
free form. More than one round of screening or selection can be performed after each round
of recombination.

If further improvement in a property is desired, at least one and usually a
collection of recombinant segments surviving a first round of screening/selection are subject
to a further round of recombination. These recombinant segments can be recombined with
each other or with exogenous segments representing the original substrates or further
variants thereof. Again, recombination can proceed in vitro or in vivo. If the previous
screening step identifies desired recombinant segments as components of cells, the
components can be subjected to further recombination in vivo, or can be subjected to further
recombination in vitro, or can be isolated before performing a round of in vitro
recombination. Conversely, if the previous screening step identifies desired recombinant
segments in naked form or as components of viruses, these segments can be introduced into
cells to perform a round of in vivo recombination. The second round of recombination,
irrespective of how it is performed, generates further recombinant segments which encompass
more diversity than is present in recombinant segments resulting from previous rounds.

The second round of recombination can be followed by a further round of
screening/selection according to the principles discussed above for the first round. The
stringency of screening/selection can be increased between rounds. Also, the nature of the
screen and the property being screened for can vary between rounds if improvement in more
than one property is desired or if acquiring more than one new property is desired.
Additional rounds of recombination and screening can then be performed until the
recombinant segments have sufficiently evolved to acquire the desired new or improved
property or function.
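
The alternation of recombination and screening described above can be pictured with a toy simulation. This is only a minimal sketch under stated assumptions, not the procedure of the invention: sequences are plain strings, `recombine` is a hypothetical single-crossover operator, and `score` (the number of positions matching a hypothetical target) stands in for whatever screen or selection is actually used. Every name and parameter here is illustrative.

```python
import random

def recombine(parent_a, parent_b, rng):
    """Single crossover between two equal-length sequences (toy recombination)."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def score(seq, target):
    """Toy screen: count positions that match a hypothetical target sequence."""
    return sum(x == y for x, y in zip(seq, target))

def evolve(substrates, target, rounds=4, library_size=100, keep=5, seed=0):
    """Alternate recombination and screening; return the best sequence
    and the best score observed after each round."""
    rng = random.Random(seed)
    pool = list(substrates)  # needs at least two distinct substrates
    best_per_round = []
    for _ in range(rounds):
        # Recombination: build a library from randomly chosen pairs in the pool.
        library = [recombine(*rng.sample(pool, 2), rng) for _ in range(library_size)]
        # Screening/selection: rank library plus current pool, keep the top members.
        ranked = sorted(set(library) | set(pool), key=lambda s: (-score(s, target), s))
        pool = ranked[:keep]
        best_per_round.append(score(pool[0], target))
    return pool[0], best_per_round
```

Because the survivors of each round are carried into the next ranking, the best score in this sketch never decreases from round to round. Lowering `keep` between rounds would be one simple way to mimic increasing stringency.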

The practice of this invention involves the construction of recombinant
nucleic acids and the expression of genes in transfected host cells. Molecular cloning
techniques to achieve these ends are known in the art. A wide variety of cloning and in vitro
amplification methods suitable for the construction of recombinant nucleic acids such as
expression vectors are well-known to persons of skill. Examples of these techniques and
instructions sufficient to direct persons of skill through many cloning exercises are found in
Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd Ed., Vols. 1-3, Cold
Spring Harbor Laboratory ("Sambrook"); Berger and Kimmel, Guide to Molecular Cloning
Techniques, Methods in Enzymology volume 152, Academic Press, Inc., San Diego, CA
("Berger"); and Current Protocols in Molecular Biology, F.M. Ausubel et al., eds., Current
Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley &
Sons, Inc., (1994 Supplement) ("Ausubel").

II. FORMATS FOR SEQUENCE RECOMBINATION
The methods of the invention entail performing recombination ("shuffling")
and screening or selection to "evolve" individual genes, whole plasmids or viruses,
multigene clusters, or even whole genomes (Stemmer (1995) Bio/Technology 13:549-553).
Reiterative cycles of recombination and screening/selection can be performed to further
evolve the nucleic acids of interest. Such techniques do not require the extensive analysis
and computation required by conventional methods for polypeptide engineering. Shuffling
allows the recombination of large numbers of mutations in a minimum number of selection
cycles, in contrast to natural pairwise recombination events (e.g., as occur during sexual
replication). Thus, the sequence recombination techniques described herein provide
particular advantages in that they provide recombination between mutations in any or all of
these, thereby providing a very fast way of exploring the manner in which different
combinations of mutations can affect a desired result. In some instances, however, structural
and/or functional information is available which, although not required for sequence
recombination, provides opportunities for modification of the technique.

A number of publications by the inventors and their co-workers describe
DNA shuffling. Stemmer et al. (1994) "Rapid Evolution of a Protein" Nature 370:389-391;
Stemmer (1994) "DNA Shuffling by Random Fragmentation and Reassembly: in vitro
Recombination for Molecular Evolution," Proc. Natl. Acad. Sci. USA 91:10747-10751; Stemmer,
U.S. Patent No. 5,605,793 METHODS FOR IN VITRO RECOMBINATION; Stemmer et
al. U.S. Pat. No. 5,830,721 DNA MUTAGENESIS BY RANDOM FRAGMENTATION
AND REASSEMBLY and Stemmer et al. U.S. Pat. No. 5,811,238 METHODS FOR
GENERATING POLYNUCLEOTIDES HAVING DESIRED CHARACTERISTICS BY
ITERATIVE SELECTION AND RECOMBINATION describe, e.g., in vitro protein
shuffling methods, e.g., by repeated cycles of mutagenesis, shuffling and selection, as well as
a variety of methods of generating libraries of displayed peptides and antibodies and a
variety of DNA reassembly techniques following DNA fragmentation, and their application
to mutagenesis in vitro and in vivo.

Applications of DNA shuffling technology have also been developed by the
inventors and their co-workers. In addition to the publications noted above, Minshull et al.,
U.S. Pat. No. 5,837,458 METHODS AND COMPOSITIONS FOR CELLULAR AND
METABOLIC ENGINEERING provides for the evolution of new metabolic pathways and
the enhancement of bio-processing through recursive shuffling techniques. Crameri et al.
(1996), "Construction And Evolution Of Antibody-Phage Libraries By DNA Shuffling"
Nature Medicine 2(1):100-103 describe antibody shuffling for antibody phage libraries.
Additional details regarding DNA shuffling can also be found in WO95/22625,
WO97/20078, WO96/33207, WO97/33957, WO98/27230, WO97/35966, WO98/31837,
WO98/13487, WO98/13485 and WO98/42832.

A number of the publications of the inventors and their co-workers, as well as
other investigators in the art, also describe techniques which facilitate DNA shuffling, e.g.,
by providing for reassembly of genes from small fragments of genes, or even
oligonucleotides encoding gene fragments. For example, in addition to the publications
noted above, Stemmer et al. (1998) U.S. Pat. No. 5,834,252 END COMPLEMENTARY
POLYMERASE REACTION describe processes for amplifying and detecting a target
sequence (e.g., in a mixture of nucleic acids), as well as for assembling large polynucleotides
from fragments.

Creation of Recombinant Libraries 

The invention involves creating recombinant libraries of polynucleotides that
are then screened to identify those library members that exhibit a desired property, e.g.,
which encode insecticidal activity. The recombinant libraries can be created using any of the
various methods herein, as well as many others which would be apparent to one of skill.

Methods for obtaining recombinant polynucleotides and/or for obtaining
diversity in nucleic acids used as the substrates for DNA shuffling as described below
include, for example, homologous recombination (PCT/US98/05223; Publ. No.
WO98/42727); oligonucleotide-directed mutagenesis (for review see, Smith, Ann. Rev.
Genet. 19:423-462 (1985); Botstein and Shortle, Science 229:1193-1201 (1985); Carter,
Biochem. J. 237:1-7 (1986); Kunkel, "The efficiency of oligonucleotide directed
mutagenesis" in Nucleic Acids & Molecular Biology, Eckstein and Lilley, eds., Springer
Verlag, Berlin (1987)). Included among these methods are oligonucleotide-directed
mutagenesis (Zoller and Smith, Nucl. Acids Res. 10:6487-6500 (1982), Methods in Enzymol.
100:468-500 (1983), and Methods in Enzymol. 154:329-350 (1987)), phosphothioate-modified
DNA mutagenesis (Taylor et al., Nucl. Acids Res. 13:8749-8764 (1985); Taylor et
al., Nucl. Acids Res. 13:8765-8787 (1985); Nakamaye and Eckstein, Nucl. Acids Res. 14:
9679-9698 (1986); Sayers et al., Nucl. Acids Res. 16:791-802 (1988); Sayers et al., Nucl.
Acids Res. 16:803-814 (1988)), mutagenesis using uracil-containing templates (Kunkel,
Proc. Nat'l Acad. Sci. USA 82:488-492 (1985) and Kunkel et al., Methods in Enzymol. 154:
367-382), and mutagenesis using gapped duplex DNA (Kramer et al., Nucl. Acids Res. 12:
9441-9456 (1984); Kramer and Fritz, Methods in Enzymol. 154:350-367 (1987); Kramer et
al., Nucl. Acids Res. 16:7207 (1988); and Fritz et al., Nucl. Acids Res. 16:6987-6999
(1988)). Additional suitable methods include point mismatch repair (Kramer et al., Cell 38:
879-887 (1984)), mutagenesis using repair-deficient host strains (Carter et al., Nucl. Acids
Res. 13:4431-4443 (1985); Carter, Methods in Enzymol. 154:382-403 (1987)), deletion
mutagenesis (Eghtedarzadeh and Henikoff, Nucl. Acids Res. 14:5115 (1986)), restriction-selection
and restriction-purification (Wells et al., ... (1986)), and mutagenesis by total gene
synthesis (Nambiar et al., Science 223:1299-1301 (1984); Sakamar and Khorana, Nucl. Acids
Res. 14:6361-6372 (1988); Wells et al., Gene 34:315-323 (1985); and Grundstrom et al.,
Nucl. Acids Res. 13:3305-3316 (1985)). Kits for mutagenesis are commercially available
(e.g., Bio-Rad, Amersham International, Anglian Biotechnology).

In a presently preferred embodiment, the recombinant libraries are prepared
using DNA shuffling. The shuffling and screening or selection can be used to "evolve"
individual genes, whole plasmids or viruses, multigene clusters, or even whole genomes
(Stemmer (1995) Bio/Technology 13:549-553).

Reiterative cycles of recombination and screening/selection can be
performed to further evolve the nucleic acids of interest. Such techniques do not require the
extensive analysis and computation required by conventional methods for polypeptide
engineering. Shuffling allows the recombination of large numbers of mutations in a
minimum number of selection cycles, in contrast to traditional, pairwise recombination
events. Thus, the sequence recombination techniques described herein provide particular
advantages in that they provide recombination between mutations in any or all of these,
thereby providing a very fast way of exploring the manner in which different combinations
of mutations can affect a desired result. In some instances, however, structural and/or
functional information is available which, although not required for sequence recombination,
provides opportunities for modification of the technique.

Exemplary formats and examples for sequence recombination, sometimes
referred to as DNA shuffling, evolution, or molecular breeding, have been described by the
present inventors and co-workers in co-pending applications U.S. Patent Application Serial
No. 08/198,431, filed February 17, 1994, Serial No. PCT/US95/02126, filed February 17,
1995, Serial No. 08/425,684, filed April 18, 1995, Serial No. 08/537,874, filed October 30,
1995, Serial No. 08/564,955, filed November 30, 1995, Serial No. 08/621,859, filed March
25, 1996, Serial No. 08/621,430, filed March 25, 1996, Serial No. PCT/US96/05480, filed
April 18, 1996, Serial No. 08/650,400, filed May 20, 1996, Serial No. 08/675,502, filed July
3, 1996, Serial No. 08/721,824, filed September 27, 1996, Serial No. PCT/US97/17300,
filed September 26, 1997, and Serial No. PCT/US97/24239, filed December 17, 1997;
Stemmer, Science 270:1510 (1995); Stemmer et al., Gene 164:49-53 (1995); Stemmer,
Bio/Technology 13:549-553 (1995); Stemmer, Proc. Natl. Acad. Sci. U.S.A. 91:10747-10751
(1994); Stemmer, Nature 370:389-391 (1994); Crameri et al., Nature Medicine 2(1):100-103
(1996); Crameri et al., Nature Biotechnology 14:315-319 (1996), each of which is
incorporated by reference in its entirety for all purposes.




ADDITIONAL SHUFFLING FORMAT INFORMATION

The methods of the invention entail performing recombination ("shuffling")
and screening or selection to "evolve" individual genes, whole plasmids or viruses,
multigene clusters, or even whole genomes (Stemmer (1995) Bio/Technology 13:549-553).
Reiterative cycles of recombination and screening/selection can be performed to further
evolve the nucleic acids of interest. Such techniques do not require the extensive analysis
and computation required by conventional methods for polypeptide engineering. Shuffling
allows the recombination of large numbers of mutations in a minimum number of selection
cycles, in contrast to traditional, pairwise recombination events. Thus, the sequence
recombination techniques described herein provide particular advantages in that they provide
recombination between mutations in any or all of these, thereby providing a very fast way of
exploring the manner in which different combinations of mutations can affect a desired
result. In some instances, however, structural and/or functional information is available
which, although not required for sequence recombination, provides opportunities for
modification of the technique.

Exemplary formats and examples for sequence recombination, sometimes
referred to as DNA shuffling, evolution, or molecular breeding, have been described by the
present inventors and co-workers in the following patents and patent applications: US Patent
No. 5,605,793; PCT Application WO 95/22625 (Serial No. PCT/US95/02126), filed
February 17, 1995; US Serial No. 08/425,684, filed April 18, 1995; US Serial No.
08/621,430, filed March 25, 1996; PCT Application WO 97/20078 (Serial No.
PCT/US96/05480), filed April 18, 1996; PCT Application WO 97/35966, filed March 20,
1997; US Serial No. 08/675,502, filed July 3, 1996; US Serial No. 08/721,824, filed
September 27, 1996; PCT Application WO 98/13487, filed September 26, 1997; Stemmer,
Science 270:1510 (1995); Stemmer et al., Gene 164:49-53 (1995); Stemmer, Bio/Technology
13:549-553 (1995); Stemmer, Proc. Natl. Acad. Sci. U.S.A. 91:10747-10751 (1994);
Stemmer, Nature 370:389-391 (1994); Crameri et al., Nature Medicine 2(1):100-103 (1996);
Crameri et al., Nature Biotechnology 14:315-319 (1996), each of which is incorporated by
reference in its entirety for all purposes.

The breeding procedure starts with at least two substrates that generally show
substantial sequence identity to each other (i.e., at least about 30%, 50%, 70%, 80% or 90%
sequence identity), but differ from each other at certain positions. The difference can be any
type of mutation, for example, substitutions, insertions and deletions. Often, different
segments differ from each other in perhaps 5-20 positions. For recombination to generate
increased diversity relative to the starting materials, the starting materials must differ from
each other in at least two nucleotide positions. That is, if there are only two substrates, there
should be at least two divergent positions. If there are three substrates, for example, one
substrate can differ from the second at a single position, and the second can differ from the
third at a different single position. The starting DNA segments can be natural variants of
each other, for example, allelic or species variants. The segments can also be from
nonallelic genes showing some degree of structural and usually functional relatedness (e.g.,
different genes within a superfamily such as the Bacillus thuringiensis toxin family). The
starting DNA segments can also be induced variants of each other. For example, one DNA
segment can be produced by error-prone PCR replication of the other, or by substitution of a
mutagenic cassette. Induced mutants can also be prepared by propagating one (or both) of
the segments in a mutagenic strain. In these situations, strictly speaking, the second DNA
segment is not a single segment but a large family of related segments. The different
segments forming the starting materials are often the same length or substantially the same
length. However, this need not be the case; for example, one segment can be a subsequence
of another. The segments can be present as part of larger molecules, such as vectors, or can
be in isolated form.
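
The substrate requirements stated above (a level of sequence identity, and at least two divergent positions across the starting set) can be checked with a few lines of code. The following is a minimal sketch that assumes the substrates are already aligned and of equal length; the function names are hypothetical and chosen only for illustration.

```python
def divergent_positions(a, b):
    """Indices at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

def percent_identity(a, b):
    """Percentage of aligned positions that are identical."""
    return 100.0 * (len(a) - len(divergent_positions(a, b))) / len(a)

def can_generate_diversity(substrates):
    """True if the substrate set varies at two or more positions overall,
    which is the stated minimum for recombination to create new diversity."""
    varying = set()
    for k, a in enumerate(substrates):
        for b in substrates[k + 1:]:
            varying.update(divergent_positions(a, b))
    return len(varying) >= 2
```

With two substrates that differ at one position the check fails; adding a third substrate that differs from the second at a different single position makes it pass, matching the three-substrate case described in the text.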

The starting DNA segments are recombined by any of the sequence 
recombination formats provided herein to generate a diverse library of recombinant DNA 
segments. Such a library can vary widely in size from having fewer than 10 to more than 
10^5, 10^9, or 10^12 members. In some embodiments, the starting segments and the recombinant
libraries generated will include full-length coding sequences and any essential regulatory
sequences, such as a promoter and polyadenylation sequence, required for expression. In
other embodiments, the recombinant DNA segments in the library can be inserted into a
common vector providing sequences necessary for expression before performing 
screening/selection. 

A. Use of Restriction Enzyme Sites to Recombine Mutations
In some situations it is advantageous to use restriction enzyme sites in nucleic
acids to direct the recombination of mutations in a nucleic acid sequence of interest. These
techniques are particularly preferred in the evolution of fragments that cannot readily be
shuffled by existing methods due to the presence of repeated DNA or other problematic
primary sequence motifs. These situations also include recombination formats in which it is
preferred to retain certain sequences unmutated. The use of restriction enzyme sites is also
preferred for shuffling large fragments (typically greater than 10 kb), such as gene clusters
that cannot be readily shuffled and "PCR-amplified" because of their size. Although
fragments up to 50 kb have been reported to be amplified by PCR (Barnes, Proc. Natl. Acad.
Sci. U.S.A. 91:2216-2220 (1994)), amplification can be problematic for fragments over 10 kb,
and thus alternative methods for shuffling in the range of 10 - 50 kb and beyond are preferred.
Preferably, the restriction endonucleases used are of the Class II type (Sambrook et al.,
Molecular Cloning, CSH Press, 1987) and of these, preferably those which generate
nonpalindromic sticky end overhangs such as AlwN I, Sfi I or BstX I. These enzymes
generate nonpalindromic ends that allow for efficient ordered reassembly with DNA ligase.
Typically, restriction enzyme (or endonuclease) sites are identified by conventional
restriction enzyme mapping techniques (Sambrook et al., supra.), by analysis of sequence
information for that gene, or by introduction of desired restriction sites into a nucleic acid
sequence by synthesis (i.e., by incorporation of silent mutations).

The DNA substrate molecules to be digested can either be from in vivo
replicated DNA, such as a plasmid preparation, or from PCR amplified nucleic acid
fragments harboring the restriction enzyme recognition sites of interest, preferably near the
ends of the fragment. Typically, at least two variants of a gene of interest, each having one or
more mutations, are digested with at least one restriction enzyme determined to cut within
the nucleic acid sequence of interest. The restriction fragments are then joined with DNA
ligase to generate full length genes having shuffled regions. The number of regions shuffled
will depend on the number of cuts within the nucleic acid sequence of interest. The shuffled
molecules can be introduced into cells as described above and screened or selected for a
desired property as described herein. Nucleic acid can then be isolated from pools (libraries)
or clones having desired properties and subjected to the same procedure until a desired
degree of improvement is obtained.
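
The digestion and ordered religation just described can be illustrated with a small in silico sketch. It rests on simplifying assumptions that are not in the text: all variants have the same length, every variant is cut at the same coordinates, and the nonpalindromic overhangs force fragments to religate only in their original order. The function names and the `cut_positions` parameter are hypothetical.

```python
from itertools import product

def digest(seq, cut_positions):
    """Split seq at the given cut coordinates, returning ordered fragments."""
    bounds = [0, *sorted(cut_positions), len(seq)]
    return [seq[start:end] for start, end in zip(bounds, bounds[1:])]

def shuffle_by_restriction(variants, cut_positions):
    """All ordered reassemblies in which each region is taken from any variant."""
    digests = [digest(v, cut_positions) for v in variants]
    regions = zip(*digests)  # group the fragments of all variants by region
    return sorted({"".join(choice) for choice in product(*regions)})
```

With c cuts there are c + 1 shuffled regions, so n variants can yield at most n to the power (c + 1) distinct full-length products; fewer result when some variants are identical within a region.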

In some embodiments, at least one DNA substrate molecule or fragment
thereof is isolated and subjected to mutagenesis. In some embodiments, the pool or library of
religated restriction fragments is subjected to mutagenesis before the digestion-ligation
process is repeated. "Mutagenesis" as used herein comprises such techniques known in the
art as PCR mutagenesis, oligonucleotide-directed mutagenesis, site-directed mutagenesis,
etc., and recursive sequence recombination by any of the techniques described herein.

B. Reassembly PCR
A further technique for recombining mutations in a nucleic acid sequence
utilizes "reassembly PCR". This method can be used to assemble multiple segments that
have been separately evolved into a full length nucleic acid template such as a gene. This
technique is performed when a pool of advantageous mutants is known from previous work
or has been identified by screening mutants that may have been created by any mutagenesis
technique known in the art, such as PCR mutagenesis, cassette mutagenesis, doped oligo
mutagenesis, chemical mutagenesis, or propagation of the DNA template in vivo in mutator
strains. Boundaries defining segments of a nucleic acid sequence of interest preferably lie in
intergenic regions, introns, or areas of a gene not likely to have mutations of interest.
Preferably, oligonucleotide primers (oligos) are synthesized for PCR amplification of
segments of the nucleic acid sequence of interest, such that the sequences of the
oligonucleotides overlap the junctions of two segments. The overlap region is typically about
10 to 100 nucleotides in length. Each of the segments is amplified with a set of such primers.
The PCR products are then "reassembled" according to assembly protocols such as those
discussed herein to assemble randomly fragmented genes. In brief, in an assembly protocol
the PCR products are first purified away from the primers, by, for example, gel
electrophoresis or size exclusion chromatography. Purified products are mixed together and
subjected to about 1-10 cycles of denaturing, reannealing, and extension in the presence of
polymerase and deoxynucleoside triphosphates (dNTP's) and appropriate buffer salts in the
absence of additional primers ("self-priming"). Subsequent PCR with primers flanking the
gene is used to amplify the yield of the fully reassembled and shuffled genes.

In some embodiments, the resulting reassembled genes are subjected to
mutagenesis before the process is repeated.
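
The way overlapping junctions let separately amplified segments rejoin into a full-length gene can be shown with a string-based sketch. It is only an illustration under stated assumptions: segments are given in order, every junction shares an exact overlap of a known fixed length, and any mutations lie outside the overlaps. The names `join_overlapping` and `reassemble` are hypothetical.

```python
def join_overlapping(left, right, overlap):
    """Join two segments whose adjoining ends share `overlap` identical bases."""
    if left[-overlap:] != right[:overlap]:
        raise ValueError("segment ends do not share the expected overlap")
    return left + right[overlap:]

def reassemble(segments, overlap):
    """Assemble ordered segments into one full-length sequence."""
    assembled = segments[0]
    for segment in segments[1:]:
        assembled = join_overlapping(assembled, segment, overlap)
    return assembled
```

Because only the junctions must match, a segment carrying a mutation in its interior can be swapped in for the corresponding wild-type segment and still assemble, which is the property the method relies on when combining separately evolved segments.
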

In a further embodiment, the PCR primers for amplification of segments of
the nucleic acid sequence of interest are used to introduce variation into the gene of interest
as follows. Mutations at sites of interest in a nucleic acid sequence are identified by
screening or selection, by sequencing homologues of the nucleic acid sequence, and so on.
Oligonucleotide PCR primers are then synthesized which encode wild type or mutant
information at sites of interest. These primers are then used in PCR mutagenesis to generate
libraries of full length genes encoding permutations of wild type and mutant information at
the designated positions. This technique is typically advantageous in cases where the
screening or selection process is expensive, cumbersome, or impractical relative to the cost
of sequencing the genes of mutants of interest and synthesizing mutagenic oligonucleotides.
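
The library of "permutations of wild type and mutant information at the designated positions" can be enumerated directly. The sketch below assumes substitutions only, described by a hypothetical dictionary mapping a 0-based position to its mutant base; it shows the combinatorics, not the PCR chemistry.

```python
from itertools import product

def permutation_library(wild_type, mutations):
    """Enumerate every wild-type/mutant combination at the designated sites.

    mutations: dict mapping 0-based position -> mutant base (illustrative format).
    """
    positions = sorted(mutations)
    library = []
    for pattern in product([False, True], repeat=len(positions)):
        seq = list(wild_type)
        for pos, use_mutant in zip(positions, pattern):
            if use_mutant:
                seq[pos] = mutations[pos]
        library.append("".join(seq))
    return library
```

With k designated sites the library has 2 to the power k members, which is why the approach pays off mainly when screening is costly compared with sequencing and oligonucleotide synthesis.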

C. Site Directed Mutagenesis (SDM) with Oligonucleotides Encoding
Homologue Mutations Followed by Shuffling

In some embodiments of the invention, sequence information from one or
more substrate sequences is added to a given "parental" sequence of interest, with
subsequent recombination between rounds of screening or selection. Typically, this is done
with site-directed mutagenesis performed by techniques well known in the art (Sambrook et
al., supra.) with one substrate as template and oligonucleotides encoding single or multiple
mutations from other substrate sequences, e.g. homologous genes. After screening or
selection for an improved phenotype of interest, the selected recombinant(s) can be further
evolved using the recursive sequence recombination (RSR) techniques described herein. After
screening or selection, site-directed mutagenesis can be done again with another collection of
oligonucleotides encoding homologue mutations, and the above process repeated until the
desired properties are obtained.

When the difference between two homologues is one or more single point
mutations in a codon, degenerate oligonucleotides can be used that encode the sequences in
both homologues. One oligonucleotide can include many such degenerate codons and still
allow one to exhaustively search all permutations over that block of sequence.
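
A degenerate codon covering two homologues can be derived position by position using the standard IUPAC two-base ambiguity codes (R = A/G, Y = C/T, S = C/G, W = A/T, K = G/T, M = A/C). The following is a minimal sketch; the helper names are hypothetical and only two-way degeneracy is handled.

```python
from itertools import product

# Standard IUPAC codes for single bases and two-base ambiguities.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}
EXPAND = {code: sorted(bases) for bases, code in IUPAC.items()}

def degenerate_codon(codon_a, codon_b):
    """Degenerate sequence that encodes both input codons."""
    return "".join(IUPAC[frozenset((x, y))] for x, y in zip(codon_a, codon_b))

def expand(oligo):
    """All concrete sequences encoded by a degenerate oligonucleotide."""
    return ["".join(p) for p in product(*(EXPAND[c] for c in oligo))]
```

For example, `degenerate_codon("GCT", "GTT")` gives `"GYT"`, which expands back to exactly the two parental codons. When two codons differ at two positions, the degenerate codon also encodes two codons present in neither homologue, so the resulting library is somewhat larger than the strict set of homologue permutations.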

When the homologue sequence space is very large, it can be advantageous to
restrict the search to certain variants. Thus, for example, computer modeling tools (Lathrop
et al. (1996) J. Mol. Biol. 255:641-665) can be used to model each homologue mutation
onto the target protein and discard any mutations that are predicted to grossly disrupt
structure and function.

D. In Vitro DNA Shuffling Formats

One embodiment for shuffling DNA sequences in vitro is illustrated in Figure
1. The initial substrates for recombination are a pool of related sequences, e.g., different,
variant forms, as homologs from different individuals, strains, or species of an organism, or
related sequences from the same organism, as allelic variations. The X's in Figure 1, panel
A, show where the sequences diverge. The sequences can be DNA or RNA and can be of
various lengths depending on the size of the gene or DNA fragment to be recombined or
reassembled. Preferably the sequences are from 50 base pairs (bp) to 50 kilobases (kb).

The pool of related substrates is converted into overlapping fragments, e.g.,
from about 5 bp to 5 kb or more, as shown in Figure 1, panel B. Often, for example, the size
of the fragments is from about 10 bp to 1000 bp, and sometimes the size of the DNA
fragments is from about 100 bp to 500 bp. The conversion can be effected by a number of
different methods, such as DNase I or RNase digestion, random shearing or partial
restriction enzyme digestion. For discussions of protocols for the isolation, manipulation,
enzymatic digestion, and the like of nucleic acids, see, for example, Sambrook et al. and
Ausubel, both supra. The concentration of nucleic acid fragments of a particular length and
sequence is often less than 0.1% or 1% by weight of the total nucleic acid. The number of
different specific nucleic acid fragments in the mixture is usually at least about 100, 500 or
1000.

The mixed population of nucleic acid fragments is converted to at least
partially single-stranded form using a variety of techniques, including, for example, heating,
chemical denaturation, use of DNA binding proteins, and the like. Conversion can be
effected by heating to about 80°C to 100°C, more preferably from 90°C to 96°C, to form
single-stranded nucleic acid fragments and then reannealing. Conversion can also be
effected by treatment with single-stranded DNA binding protein (see Wold (1997) Annu.
Rev. Biochem. 66:61-92) or recA protein (see, e.g., Kiianitsa (1997) Proc. Natl. Acad. Sci.
USA 94:7837-7840). Single-stranded nucleic acid fragments having regions of sequence
identity with other single-stranded nucleic acid fragments can then be reannealed by cooling
to 20°C to 75°C, and preferably from 40°C to 65°C. Renaturation can be accelerated by the
addition of polyethylene glycol (PEG), other volume-excluding reagents or salt. The salt
concentration is preferably from 0 mM to 200 mM, more preferably the salt concentration is
from 10 mM to 100 mM. The salt may be KCl or NaCl. The concentration of PEG is
preferably from 0% to 20%, more preferably from 5% to 10%. The fragments that reanneal
can be from different substrates as shown in Figure 1, panel C. The annealed nucleic acid
fragments are incubated in the presence of a nucleic acid polymerase, such as Taq or
Klenow, and dNTPs (i.e., dATP, dCTP, dGTP and dTTP). If regions of sequence identity
are large, Taq polymerase can be used with an annealing temperature of between 45-65°C.
If the areas of identity are small, Klenow polymerase can be used with an annealing
temperature of between 20-30°C. The polymerase can be added to the random nucleic acid
fragments prior to annealing, simultaneously with annealing or after annealing.
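
The reannealing ranges quoted above lend themselves to a simple parameter check. The following is a minimal sketch only: the class name, field names and labels are hypothetical, and the numeric ranges are copied from the text (reannealing temperature, salt concentration and PEG concentration).

```python
from dataclasses import dataclass

# (broader range, narrower range) for each parameter, as quoted in the text.
RANGES = {
    "anneal_temp_c": ((20, 75), (40, 65)),
    "salt_mm": ((0, 200), (10, 100)),
    "peg_percent": ((0, 20), (5, 10)),
}


@dataclass
class ReannealConditions:
    anneal_temp_c: float  # reannealing temperature in degrees Celsius
    salt_mm: float        # KCl or NaCl concentration in mM
    peg_percent: float    # PEG concentration in percent


def classify(conditions):
    """Label each parameter 'narrow', 'broad' or 'outside'."""
    report = {}
    for name, (broad, narrow) in RANGES.items():
        value = getattr(conditions, name)
        if narrow[0] <= value <= narrow[1]:
            report[name] = "narrow"
        elif broad[0] <= value <= broad[1]:
            report[name] = "broad"
        else:
            report[name] = "outside"
    return report


print(classify(ReannealConditions(anneal_temp_c=50, salt_mm=50, peg_percent=7)))
```

A value labelled "narrow" falls inside the more preferred range, "broad" inside the wider stated range only, and "outside" in neither.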

The process of denaturation, renaturation and incubation in the presence of
polymerase of overlapping fragments to generate a collection of polynucleotides containing
different permutations of fragments is sometimes referred to as shuffling of the nucleic acid
in vitro. This cycle is repeated for a desired number of times. Preferably the cycle is
repeated from 2 to 100 times, more preferably the sequence is repeated from 10 to 40 times.
The resulting nucleic acids are a family of double-stranded polynucleotides of from about 50
bp to about 100 kb, preferably from 500 bp to 50 kb, as shown in Figure 1, panel D. The
population represents variants of the starting substrates showing substantial sequence
identity thereto but also diverging at several positions. The population has many more
members than the starting substrates. The population of fragments resulting from shuffling
is used to transform host cells, optionally after cloning into a vector.
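
As a rough illustration of how repeated cycles generate permutations of the starting substrates, the following toy model reassembles chimeras from pre-aligned parent strings. It is a sketch under strong simplifying assumptions that are not part of the specification: parents are equal-length strings, fragments fall on fixed block boundaries, and a template is chosen uniformly at random for each block. Real reassembly depends on annealing at regions of sequence identity. All names are hypothetical.

```python
import random


def reassemble(parents, fragment_len, rng):
    """Build one chimera by copying successive blocks from random parents."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must be pre-aligned to equal length")
    pieces = []
    for start in range(0, length, fragment_len):
        template = rng.choice(parents)
        pieces.append(template[start:start + fragment_len])
    return "".join(pieces)


def shuffle(parents, cycles=2, library_size=8, fragment_len=4, seed=0):
    """Run several cycles; each cycle's products seed the next cycle."""
    rng = random.Random(seed)
    pool = list(parents)
    for _ in range(cycles):
        pool = [reassemble(pool, fragment_len, rng) for _ in range(library_size)]
    return pool


# Two hypothetical parents that differ at every position.
for chimera in shuffle(["AAAAAAAAAAAA", "CCCCCCCCCCCC"], seed=1):
    print(chimera)
```

Every block of every product is inherited from one of the starting parents, so the library has more members than the starting substrates while remaining closely related to them.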

In one embodiment utilizing in vitro shuffling, subsequences of
recombination substrates can be generated by amplifying the full-length sequences under
conditions which produce a substantial fraction, typically at least 20 percent or more, of
incompletely extended amplification products. Another embodiment uses random primers to
prime the entire template DNA to generate less than full length amplification products. The
amplification products, including the incompletely extended amplification products, are
denatured and subjected to at least one additional cycle of reannealing and amplification.
This variation, in which at least one cycle of reannealing and amplification provides a
substantial fraction of incompletely extended products, is termed "stuttering." In the
subsequent amplification round, the partially extended (less than full length) products
reanneal to and prime extension on different sequence-related template species. In another
embodiment, the conversion of substrates to fragments can be effected by partial PCR
amplification of substrates.

In another embodiment, a mixture of fragments is spiked with one or more 
oligonucleotides. The oligonucleotides can be designed to include precharacterized 
mutations of a wildtype sequence, or sites of natural variations between individuals or 
species. The oligonucleotides also include sufficient sequence or structural homology 
flanking such mutations or variations to allow annealing with the wildtype fragments. 
Annealing temperatures can be adjusted depending on the length of homology. 

In a further embodiment, recombination occurs in at least one cycle by
template switching, such as when a DNA fragment derived from one template primes on the
homologous position of a related but different template. Template switching can be induced
by addition of recA (see, Kiianitsa (1997) supra), rad51 (see, Namsaraev (1997) Mol. Cell.
Biol. 17:5359-5368), rad55 (see, Clever (1997) EMBO J. 16:2535-2544), rad57 (see, Sung
(1997) Genes Dev. 11:1111-1121) or other polymerases (e.g., viral polymerases, reverse
transcriptase) to the amplification mixture. Template switching can also be increased by
increasing the DNA template concentration.

Another embodiment utilizes at least one cycle of amplification, which can be
conducted using a collection of overlapping single-stranded DNA fragments of related
sequence, and different lengths. Fragments can be prepared using a single stranded DNA
phage, such as M13 (see, Wang (1997) Biochemistry 36:9486-9492). Each fragment can
hybridize to and prime polynucleotide chain extension of a second fragment from the
collection, thus forming sequence-recombined polynucleotides. In a further variation,
ssDNA fragments of variable length can be generated from a single primer by Pfu, Taq,
Vent, Deep Vent, UlTma DNA polymerase or other DNA polymerases on a first DNA
template (see, Cline (1996) Nucleic Acids Res. 24:3546-3551). The single stranded DNA
fragments are used as primers for a second, Kunkel-type template, consisting of a
uracil-containing circular ssDNA. This results in multiple substitutions of the first template
into the second. See, Levichkin (1995) Mol. Biology 29:572-577; Jung (1992) Gene
121:17-24.

In some embodiments of the invention, shuffled nucleic acids obtained by use
of the recursive recombination methods of the invention are put into a cell and/or organism
for screening. Shuffled insect resistance genes can be introduced into, for example, bacterial
cells, yeast cells, or plant cells for initial screening. Bacillus species (such as B. subtilis) and
E. coli are two examples of suitable bacterial cells into which one can insert and express
shuffled insect resistance genes. The shuffled genes can be introduced into bacterial or yeast
cells either by integration into the chromosomal DNA or as plasmids. Shuffled genes can
also be introduced into plant cells for screening purposes. Thus, a transgene of interest can
be modified using the recursive sequence recombination methods of the invention in vitro
and reinserted into the cell for in vivo/in situ selection for the new or improved property.

E. In Vivo DNA Shuffling Formats

In some embodiments of the invention, DNA substrate molecules are
introduced into cells, wherein the cellular machinery directs their recombination. For
example, a library of mutants is constructed and screened or selected for mutants with
improved phenotypes by any of the techniques described herein. The DNA substrate
molecules encoding the best candidates are recovered by any of the techniques described
herein, then fragmented and used to transfect a plant host and screened or selected for
improved function. If further improvement is desired, the DNA substrate molecules are
recovered from the plant host cell, such as by PCR, and the process is repeated until a
desired level of improvement is obtained. In some embodiments, the fragments are
denatured and reannealed prior to transfection, coated with recombination stimulating
proteins such as recA, or co-transfected with a selectable marker such as Neo to allow the
positive selection for cells receiving recombined versions of the gene of interest. Methods
for in vivo shuffling are described in, for example, PCT application WO 98/13487.

The efficiency of in vivo shuffling can be enhanced by increasing the copy
number of a gene of interest in the host cells. For example, the majority of bacterial cells in
stationary phase cultures grown in rich media contain two, four or eight genomes. In
minimal medium the cells contain one or two genomes. The number of genomes per
bacterial cell thus depends on the growth rate of the cell as it enters stationary phase. This is
because rapidly growing cells contain multiple replication forks, resulting in several
genomes in the cells after termination. The number of genomes is strain dependent, although
all strains tested have more than one chromosome in stationary phase. The number of
genomes in stationary phase cells decreases with time. This appears to be due to
fragmentation and degradation of entire chromosomes, similar to apoptosis in mammalian
cells. This fragmentation of genomes in cells containing multiple genome copies results in
massive recombination and mutagenesis. The presence of multiple genome copies in such
cells results in a higher frequency of homologous recombination in these cells, both between
copies of a gene in different genomes within the cell, and between a genome within the cell
and a transfected fragment. The increased frequency of recombination allows one to evolve
a gene more quickly to acquire optimized characteristics.

In nature, the existence of multiple genomic copies in a cell type would
usually not be advantageous due to the greater nutritional requirements needed to maintain
this copy number. However, artificial conditions can be devised to select for high copy
number. Modified cells having recombinant genomes are grown in rich media (in which
conditions, multicopy number should not be a disadvantage) and exposed to a mutagen, such
as ultraviolet or gamma irradiation or a chemical mutagen, e.g., mitomycin, nitrous acid,
photoactivated psoralens, alone or in combination, which induces DNA breaks amenable to
repair by recombination. These conditions select for cells having multicopy number due to
the greater efficiency with which mutations can be excised. Modified cells surviving
exposure to mutagen are enriched for cells with multiple genome copies. If desired, selected
cells can be individually analyzed for genome copy number (e.g., by quantitative
hybridization with appropriate controls). For example, individual cells can be sorted using a
cell sorter for those cells containing more DNA, e.g., using DNA specific fluorescent
compounds or sorting for increased size using light dispersion. Some or all of the collection
of cells surviving selection are tested for the presence of a gene that is optimized for the
desired property.

F. Whole Genome Shuffling

In one embodiment, the selection methods herein are utilized in a "whole
genome shuffling" format. An extensive guide to the many forms of whole genome
shuffling is found in the pioneering application to the inventors and their co-workers entitled
"EVOLUTION OF WHOLE CELLS AND ORGANISMS BY RECURSIVE SEQUENCE
RECOMBINATION," Attorney Docket No. 018097-020720US filed July 15, 1998 by del
Cardayre et al. (USSN 09/116188).

In brief, whole genome shuffling makes no presuppositions at all regarding 
what nucleic acids may confer a desired property. Instead, entire genomes (e.g., from a 
genomic library, or isolated from an organism) are shuffled in cells and selection protocols 
applied to the cells.

An application of recursive whole genome shuffling is the evolution of plant
cells, and transgenic plants derived from the same, to acquire desirable insecticidal protein
production properties. The substrates for recombination can be, e.g., whole genomic
libraries, fractions thereof or focused libraries containing variants of gene(s) known or
suspected to confer tolerance to one of the above agents. Frequently, library fragments are
obtained from a different species to the plant being evolved. Regardless of the precise
shuffling methodology used, the selection methods described above for insecticidal protein
selection, including selection for any of the desirable traits noted herein, can be performed.

The DNA fragments are introduced into plant tissues, cultured plant cells or
plant protoplasts by standard methods including electroporation (Fromm et al., Proc. Natl.
Acad. Sci. USA 82, 5824 (1985)), infection by viral vectors such as cauliflower mosaic virus
(CaMV) (Hohn et al., Molecular Biology of Plant Tumors, (Academic Press, New York,
1982) pp. 549-560; Howell, US 4,407,956), high velocity ballistic penetration by small
particles with the nucleic acid either within the matrix of small beads or particles, or on the
surface (Klein et al., Nature 327, 70-73 (1987)), use of pollen as vector (WO 85/01856), or
use of Agrobacterium tumefaciens or A. rhizogenes carrying a T-DNA plasmid in which
DNA fragments are cloned. The T-DNA plasmid is transmitted to plant cells upon infection
by Agrobacterium tumefaciens, and a portion is stably integrated into the plant genome
(Horsch et al., Science 233, 496-498 (1984); Fraley et al., Proc. Natl. Acad. Sci. USA 80,
4803 (1983)).

Diversity can also be generated by genetic exchange between plant
protoplasts. Procedures for formation and fusion of plant protoplasts are described by
Takahashi et al., US 4,677,066; Akagi et al., US 5,360,725; Shimamoto et al., US 5,250,433;
Cheney et al., US 5,426,040.

After a suitable period of incubation to allow recombination to occur and for
expression of recombinant genes, the plant cells are assayed for insecticidal protein, and
suitable plant cells are collected. Some or all of these plant cells can be subject to a further
round of recombination and screening. Eventually, plant cells having the required degree of
insecticidal activity are obtained.

These cells can then be cultured into transgenic plants. Plant regeneration
from cultured protoplasts is described in Evans et al., "Protoplast Isolation and Culture,"
Handbook of Plant Cell Cultures 1, 124-176 (MacMillan Publishing Co., New York, 1983);
Davey, "Recent Developments in the Culture and Regeneration of Plant Protoplasts,"
Protoplasts, (1983) pp. 12-29, (Birkhauser, Basel 1983); Dale, "Protoplast Culture and Plant
Regeneration of Cereals and Other Recalcitrant Crops," Protoplasts (1983) pp. 31-41,
(Birkhauser, Basel 1983); Binding, "Regeneration of Plants," Plant Protoplasts, pp. 21-73,
(CRC Press, Boca Raton, 1985) and other references available to persons of skill.
Additional details regarding plant regeneration from cells are also found below.

In a variation of the above method, one or more preliminary rounds of
recombination and screening can be performed in bacterial cells according to the same
general strategy as described for plant cells. More rapid evolution can be achieved in
bacterial cells due to their greater growth rate and the greater efficiency with which DNA
can be introduced into such cells. After one or more rounds of recombination/screening, a
DNA fragment library is recovered from bacteria and transformed into the plants. The
library can either be a complete library or a focused library. A focused library can be
produced by amplification from primers specific for plant sequences, particularly plant
sequences known or suspected to have a role in conferring insect resistance or a related
property.

Plant genome shuffling allows recursive cycles to be used for the introduction
and recombination of genes or pathways that confer improved properties to desired plant
species. Any plant species, including weeds and wild cultivars, showing a desired trait, such
as insect resistance, can be used as the source of DNA that is introduced into the crop or
horticultural host plant species.

Genomic DNA prepared from the source plant is fragmented (e.g., by DNaseI,
restriction enzymes, or mechanically) and cloned into a vector suitable for making plant
genomic libraries, such as pGA482 (An, G., 1995, Methods Mol. Biol. 44:47-58). This
vector contains the A. tumefaciens left and right borders needed for gene transfer to plant
cells and antibiotic markers for selection in E. coli, Agrobacterium, and plant cells. A
multicloning site is provided for insertion of the genomic fragments. A cos sequence is
present for the efficient packaging of DNA into bacteriophage lambda heads for transfection
of the primary library into E. coli. The vector accepts DNA fragments of 25-40 kb.

The primary library can also be directly electroporated into an A. tumefaciens
or A. rhizogenes strain that is used to infect and transform host plant cells (Main, GD et al.,
1995, Methods Mol. Biol. 44:405-412). Alternatively, DNA can be introduced by
electroporation or PEG-mediated uptake into protoplasts of the recipient plant species
(Bilang et al. (1994) Plant Mol. Biol. Manual, Kluwer Academic Publishers, A1: 1-16) or by
particle bombardment of cells or tissues (Christou, ibid, A2: 1-15). If necessary, antibiotic
markers in the T-DNA region can be eliminated, as long as selection for the trait is possible,
so that the final plant products contain no antibiotic genes.

Stably transformed whole cells acquiring the trait are selected on solid or
liquid media. If the trait in question cannot be selected for directly, transformed cells can be
selected with antibiotics and allowed to form callus or regenerated to whole plants and then
screened for the desired property.

The second and further cycles consist of isolating genomic DNA from each
transgenic line and introducing it into one or more of the other transgenic lines. In each
round, transformed cells are selected or screened, typically in an incremental fashion
(increasing dosages, etc.). To speed the process of using multiple cycles of transformation,
plant regeneration can be eliminated until the last round. Callus tissue generated from the
protoplasts or transformed tissues can serve as a source of genomic DNA and new host cells.
After the final round, fertile plants are regenerated and the progeny are selected for
homozygosity of the inserted DNAs. Ultimately, a new plant is created that carries multiple
inserts which additively or synergistically combine to confer high levels of the desired trait.

In addition, the introduced DNA that confers the desired trait can be traced
because it is flanked by known sequences in the vector. Either PCR or plasmid rescue is
used to isolate the sequences and characterize them in more detail. Long PCR (Foord, OS
and Rose, EA, 1995, PCR Primer: A Laboratory Manual, CSHL Press, pp 63-77) of the full
25-40 kb insert is achieved with the proper reagents and techniques using as primers the
T-DNA border sequences. If the vector is modified to contain the E. coli origin of
replication and an antibiotic marker between the T-DNA borders, a rare cutting restriction
enzyme, such as NotI or SfiI, that cuts only at the ends of the inserted DNA is used to create
fragments containing the source plant DNA that are then self-ligated and transformed into E.
coli where they replicate as plasmids. The total DNA or subfragment of it that is responsible
for the transferred trait can be subjected to in vitro evolution by DNA shuffling. The
shuffled library is then introduced into host plant cells and screened for improvement of the
trait. In this way, single and multigene traits can be transferred from one species to another
and optimized for higher expression or activity leading to whole organism improvement.

G. Oligonucleotide and in silico shuffling formats 

In addition to the formats for shuffling noted above, at least two additional
related formats are useful in the practice of the present invention. The first, referred to as "in
silico" shuffling, utilizes computer algorithms to perform virtual shuffling using genetic
operators in a computer. As applied to the present invention, gene sequence strings
corresponding to insect resistance are recombined in a computer system and desirable
products are made, e.g., by reassembly PCR of synthetic oligonucleotides. In silico
shuffling is described in detail in Selifonov and Stemmer in "METHODS FOR MAKING
CHARACTER STRINGS, POLYNUCLEOTIDES & POLYPEPTIDES HAVING
DESIRED CHARACTERISTICS" filed 02/05/1999, USSN 60/118854.

WO 99/57128 PCT/US99/08473
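
To make the idea of applying genetic operators to sequence strings concrete, the following minimal sketch implements two common operators, single-point crossover and point mutation, on equal-length character strings. It is an illustration only and is not the method of the cited application; the function names, the alphabet and the pairing scheme are assumptions.

```python
import random


def crossover(a, b, point):
    """Single-point crossover of two equal-length character strings."""
    if len(a) != len(b):
        raise ValueError("strings must be aligned to equal length")
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(seq, rate, rng, alphabet="ACGT"):
    """Replace each character with a different symbol at probability `rate`."""
    out = []
    for ch in seq:
        if rng.random() < rate:
            out.append(rng.choice([c for c in alphabet if c != ch]))
        else:
            out.append(ch)
    return "".join(out)


def next_generation(population, rng, rate=0.0):
    """Pair strings at random, cross each pair over, then mutate the children."""
    pool = population[:]
    rng.shuffle(pool)
    children = []
    for a, b in zip(pool[::2], pool[1::2]):
        point = rng.randrange(1, len(a))
        for child in crossover(a, b, point):
            children.append(mutate(child, rate, rng))
    return children


rng = random.Random(0)
print(next_generation(["AAAAAA", "CCCCCC", "GGGGGG", "TTTTTT"], rng, rate=0.1))
```

Strings selected from such a virtual population would then still have to be synthesized, for example by reassembly PCR of synthetic oligonucleotides as noted above.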

The second useful format is referred to as "oligonucleotide mediated
shuffling" in which oligonucleotides corresponding to a family of related homologous
nucleic acids (e.g., as applied to the present invention, interspecific or allelic variants of an
insect resistance nucleic acid) are recombined to produce selectable nucleic acids. This
format is described in detail in Crameri et al., "OLIGONUCLEOTIDE MEDIATED
NUCLEIC ACID RECOMBINATION" filed February 1999, USSN 60/118,813.

In brief, a family of homologous nucleic acid sequences are first aligned, e.g.,
using available computer software to select regions of identity/similarity and regions of
diversity. A plurality (e.g., 2, 5, 10, 20, 50, 75, or 100 or more) of oligonucleotides
corresponding to at least one region of diversity are synthesized. These oligonucleotides can
be shuffled directly, or can be recombined with one or more of the family of nucleic acids.
There are several procedures available for shuffling homologous nucleic acids, such as by
digesting the nucleic acids with a DNase, permitting recombination to occur and then
regenerating full-length templates, i.e., as described in Stemmer (1998) DNA
MUTAGENESIS BY RANDOM FRAGMENTATION AND REASSEMBLY (U.S. Patent
5,830,721). Thus, in one embodiment, a full-length nucleic acid which is identical to, or
homologous with, at least one of the homologous nucleic acids is provided, cleaved with a
DNase, and the resulting set of nucleic acid fragments are recombined with the plurality of
family gene shuffling oligonucleotides.
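
The first step described above, locating regions of diversity in an alignment, can be sketched as follows. The sketch assumes the homologs have already been aligned to equal length by separate software; the function names, the `max_gap` parameter and the example sequences are hypothetical.

```python
def diversity_columns(aligned):
    """Return indices of alignment columns where the sequences differ."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [i for i in range(length) if len({s[i] for s in aligned}) > 1]


def group_regions(columns, max_gap=1):
    """Merge nearby variable columns into (start, end) regions, end inclusive."""
    regions = []
    for col in columns:
        if regions and col - regions[-1][1] <= max_gap:
            regions[-1][1] = col
        else:
            regions.append([col, col])
    return [tuple(r) for r in regions]


# Three hypothetical aligned homologs.
aligned = ["ATGGCCAAGTTT", "ATGGCTAAGTTC", "ATGGCCAAATTT"]
columns = diversity_columns(aligned)
print(columns, group_regions(columns), group_regions(columns, max_gap=3))
```

Columns not reported by `diversity_columns` are regions of identity; raising `max_gap` merges closely spaced variable columns so that one oligonucleotide could span them.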

Libraries of family gene shuffling oligonucleotides are also provided by
oligonucleotide shuffling. For example, homologous genes of interest are aligned using a
sequence alignment program such as BLAST, as described above. Nucleotides
corresponding to amino acid variations between the homologs are noted. Oligos for
synthetic gene shuffling are designed which comprise one (or more) nucleotide difference to
any of the aligned homologous sequences, i.e., oligos are designed that are identical to a first
nucleic acid, but which incorporate a residue at a position which corresponds to a residue of
a nucleic acid homologous, but not identical, to the first nucleic acid.

Typically, some or all of the oligonucleotides of a selected length (e.g., about
20, 30, 40, 50, 60, 70, 80, 90, or 100 or more nucleotides) which incorporate all possible
nucleic acid variants are made. This includes X oligonucleotides per X sequence variations,
where X is the number of different sequences at a locus. The X oligonucleotides are largely
identical in sequence, except for the nucleotide(s) representing the variant nucleotide(s).
Because of this similarity, it can be advantageous to utilize parallel or pooled synthesis
strategies in which a single synthesis reaction or set of reagents is used to make common
portions of each oligonucleotide. This can be performed, e.g., by well-known solid-phase
nucleic acid synthesis techniques, or utilizing array-based oligonucleotide synthetic methods.

Most preferably, the oligonucleotides have at least about 10 bases of
sequence identity to either side of a region of variance to ensure reasonably efficient
recombination. However, flanking regions with identical bases can have fewer identical
bases (e.g., 5, 6, 7, 8, or 9) and can, of course, have larger regions of identity (e.g., 11, 12,
13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 50, or more).
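
The design rule described above, an oligonucleotide identical to a first nucleic acid except for a residue taken from a homolog, with a stretch of identity on either side, can be sketched as follows. This is a minimal illustration assuming pre-aligned, equal-length sequences and single-nucleotide differences; the function name, the default flank of 10 bases and the example sequences are assumptions, and variant positions too close to either end to carry a full flank are simply skipped.

```python
def design_oligos(reference, homologs, flank=10):
    """Return sorted (position, oligo) pairs, one per variant residue.

    Each oligo equals the reference except at `position`, where it carries
    the homolog's residue, and has `flank` reference bases on each side.
    """
    oligos = set()
    for hom in homologs:
        if len(hom) != len(reference):
            raise ValueError("sequences must be pre-aligned to equal length")
        for i, (ref_base, hom_base) in enumerate(zip(reference, hom)):
            has_full_flanks = i >= flank and i + flank < len(reference)
            if ref_base != hom_base and has_full_flanks:
                left = reference[i - flank:i]
                right = reference[i + 1:i + 1 + flank]
                oligos.add((i, left + hom_base + right))
    return sorted(oligos)


# Hypothetical 25-base reference and a homolog differing at position 12.
reference = "ATGGCCAAGTTTGGACCTGAAGCTA"
homolog = "ATGGCCAAGTTTCGACCTGAAGCTA"
for position, oligo in design_oligos(reference, [homolog]):
    print(position, oligo)
```

Using a set means that a variant shared by several homologs yields a single oligonucleotide rather than duplicates.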

During gene assembly, oligonucleotides can be incubated together and
reassembled using any of a variety of polymerase-mediated reassembly methods, e.g., as
described herein and as known to one of skill. Selected oligonucleotides can be "spiked" in
the recombination mixture at any selected concentration, thus causing preferential
incorporation of desirable modifications.

III. SUBSTRATES FOR EVOLUTION OF OPTIMIZED GENES USEFUL IN CROP
PLANTS

The invention provides methods of obtaining pest resistance genes that are
enhanced in their ability to confer upon plants resistance to pests. The methods involve the
use of DNA shuffling to develop libraries of recombinant pest resistance genes, and the
screening of these libraries to identify those recombinant genes that exhibit the desired
improved properties. The methods are applicable to any nucleic acid that, when present in a
plant, or on a plant, can confer resistance to a pest. Several examples of such nucleic
acids are discussed herein; these and others are described in, for example, Advances in Insect
Control: The Role of Transgenic Plants, Carozzi and Koziel, eds., Taylor & Francis, New
York, 1997. Also provided are methods of obtaining other genes that are optimized for their
ability to confer a beneficial effect upon plants. These genes include, for example, genes
involved in herbicide selectivity and in nitrogen fixation.

A. Bacillus Toxins and Related Polypeptides

The invention provides methods of obtaining optimized recombinant Bt
toxins. Certain species of the gram-positive soil bacterium Bacillus produce proteins that are
toxic to insects, arachnids, and nematodes. These proteins include the crystal proteins,
known as "Bt toxins," that are produced by B. thuringiensis and other Bacillus species. Bt
toxins are typically polypeptides of about 130 kDa to 140 kDa or of about 70 kDa, which
contain toxic fragments of 60 +/- 10 kDa (Hofte and Whiteley (1989) Microbiol. Rev. 53:
242-255). Bt toxins are highly specific and lack toxicity towards humans and other animals,
and plants. Bt toxins are reviewed in, for example, Kumar et al. (1996) Adv. Appl.
Microbiol. 42: 1-43 and Peferoen (1996) in Advances in Insect Control, supra, Chapter 2,
pp. 21-48.

Bt toxins produced by different Bacillus species can be classified on the basis
of similarity of the nucleic acid and amino acid sequences, and also based on the pests against
which the toxins are effective (Hofte and Whiteley, supra; Ogiwara et al. (1995) Curr.
Microbiol. 30: 227-235). Insecticidal Bt toxins, for example, are active against one or more
of the Lepidoptera, Diptera, Coleoptera, or Phthiraptera (Kumar et al., supra). Bt genes
have been classified into at least six major classes: cryI (Lepidoptera specific), cryII
(Lepidoptera and Diptera specific), cryIII (Coleoptera specific), cryIV (Diptera specific),
cryV, and cryVI (Hofte and Whiteley, supra; Feitelson et al. (1992) Biotechnol. 10: 271-
276). Subgroups have also been proposed based on differences in insecticidal spectra, such
as cryIC, cryIIA, and cryIIB (Kumar et al., supra). Another classification is based on amino
acid identity of full-length products of Bt toxin genes (Crickmore et al. (1996) Genes
Microbiol.; Kumar et al., supra). According to this scheme, Bt toxins are divided into
several homology groups, with Cry1, -3, -4, -7, -8, -9, and -10 forming the largest group,
Cry2, Cry11, and Cry18 forming the second group, Cry5, -12, -13, and -14 the third group,
and the Cyt proteins the fourth group. Cry6, -15, and -16 are unique proteins under this
classification scheme. Classification of Bt crystal protein genes, including dendrograms
showing evolutionary relationships, is also described in Yamamoto and Powell (1993) In
Advanced Engineered Pesticides, Kim, Ed., Marcel Dekker, pp. 3-42.

The methods of the invention involve performing DNA shuffling using
nucleic acids that encode Bt toxins as the substrates. Numerous nucleic acid sequences that
encode Bt toxins have been characterized. See, e.g., US Patent Nos. 5,683,691, 5,633,446,
5,651,965, 5,635,480, 4,766,203, 4,448,885, 4,467,036, 4,797,276, 4,853,331, 4,918,006,
4,849,217, 5,151,363, 4,948,734, and 4,771,131; and European patent publications
0,149,162, 0,213,818 and 0,193,259. Many additional Bt toxin sequences are available in GenBank
and other databases. At least some Bt toxins are encoded by plasmid-borne genes (Stahly et
al. (1978) Biochem. Biophys. Res. Commun. 84: 581-588; Debabov et al. (1977) Genetika
13: 496-501).
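
The six major classes of the older, Roman-numeral nomenclature described in this section can be captured in a small lookup. The sketch below is illustrative only: the mapping reproduces the specificities stated in the text (none is stated for cryV and cryVI), and the function names and parsing rule are assumptions.

```python
import re

# Target insect orders for the six major classes, as stated in the text.
# None marks classes for which the text states no specificity.
CLASS_SPECIFICITY = {
    "cryI": ("Lepidoptera",),
    "cryII": ("Lepidoptera", "Diptera"),
    "cryIII": ("Coleoptera",),
    "cryIV": ("Diptera",),
    "cryV": None,
    "cryVI": None,
}

# Longer numerals are tried before their prefixes (e.g. "VI" before "V").
_CLASS_RE = re.compile(r"^cry(VI|IV|V|III|II|I)", re.IGNORECASE)


def major_class(old_name):
    """Return the major class for an old-style gene name, or None."""
    match = _CLASS_RE.match(old_name.strip())
    if match is None:
        return None
    return "cry" + match.group(1).upper()


def specificity(old_name):
    """Return the stated target orders for an old-style gene name, if any."""
    return CLASS_SPECIFICITY.get(major_class(old_name))


for name in ["cryIA(a)", "cryIIB", "cryIIIA", "cryIVD", "cryVIA", "PrtB"]:
    print(name, major_class(name), specificity(name))
```

Names that do not follow the Roman-numeral pattern, such as PrtB, return None rather than a guessed class.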

Libraries of the recombinant Bt toxin genes are prepared by DNA shuffling.
In preferred embodiments, the substrates for DNA shuffling are derived from Bt toxin
families. Figure 2 provides a dendrogram showing relationships among many Bt toxin genes.
A list of Bt holotype toxins, together with database accession numbers, is provided in Table
1. A list of these and other Bt toxin genes is provided in Table 2.

Table I : List of Bacillus thuriugiensis Holotype Toxins 



Name 


Old ; Acc iNum 
Name [ 


crylAa 


crylA(a) j M 11250 


crylAb 


crylA(b) 


M13898 


crylAc 


crylA(c) 


Ml 1068 


cry 1 Ad 


crylA(d) 


M73250 


cryl Ae 


crylA(e) 


M65252 


cryl Af 


icp 


U 82003 


cry lAg 




AF081248 


crylBa 


cryB 


X06711 


crylBb ' 


ET5 


L32020 


crylBc 


PEGS 


Z46442 


crylBd 


cryEl 


U70726 


crylCa 


crylC 


.X07518 


crylCb 


cryIC(b) 


M97880 


cryl Da 


crylD 


X54160 


crylDb 


PrtB 


,222511 


crylEa 


crylE 


X53985 


cryl be 


crylE(b) 


M73253 


cryl Fa 


crylF 


M63897 


crylFb 


PrtD 


Z22512 


crylGa 


PrtA 


Z22510 


cry 1Gb 


cryH2 


U70725 
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Table 1 (con't) • 



Name      Old Name     Acc Num
cry1Ha    PrtC         Z22513
cry1Hb                 U35780
cry1Ia    cryV         X62821
cry1Ic                 AF056933
cry1Ib    CryV465      U07642
cry1Ja    ET4          L32019
cry1Jb    ET1          U31527
cry1Jc                 I90730
cry1Ka                 U28801
cry2Aa    cryIIA       M31738
cry2Ab    cryIIB       M23724
cry2Ac    cryIIC       X57252
cry3Aa    cryIIIA      M22472
cry3Ba    CryIIIB2     X17123
cry3Bb    cryIIIBb     M89794
cry3Ca    CryIIID      X59797
cry4Aa    cryIVA       Y00423
cry4Ba    cryIVB       X07423
cry5Aa    cryVA(a)     L07025
cry5Ab    cryVA(b)     L07026
cry5Ac                 I34543
cry5Ba                 U19725
cry6Aa    cryVIA       L07022
cry6Ba    cryVIB       L07024
cry7Aa    cryIIIC      M64478
cry7Ab    cryIIICb     U04367
cry8Aa    cryIIIE      U04364
cry8Ba    cryIIIG      U04365
cry8Ca    cryIIIF      U04366
cry9Aa    cryIG        X58120
cry9Ba    cryIX        X75019
cry9Ca    cryIH        Z37527
cry9Da                 D85560
cry9Ea                 AB011496
cry10Aa   cryIVC       M12662
cry11Aa   cryIVD       M31737
cry11Ba   Jeg80        X86902
cry11Bb                AF017416
cry12Aa   cryVB        L07027
cry13Aa   cryVC        L07023
cry14Aa   cryVD        U13955
cry15Aa   34kDa        M76442
cry16Aa   cbm71        X94146
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Table 1 (con't) 



Name         Old Name     Acc Num
cry17Aa      cbm72        X99478
cry18Aa      CryBP1       X99049
cry19Aa      Jeg65        Y07603
cry20Aa                   U82518
cry21Aa                   I32932
[illegible]               [illegible]
cry24Aa      Jeg72        U88188
cry25Aa      Jeg74        U88189
cry26Aa                   AF122897
cry27Aa                   [illegible]
cry28Aa                   AF132928
cyt1Aa       cytA         X03182
cyt1Ab       cytM         X98793
cyt1Ba                    U37196
cyt2Aa       cytB         Z14147
cyt2Ba       cytB         U52043
cyt2Bb                    U82519



Table 2: Bt toxin genes 



Name      Acc. No.    Reference                  Year  Journal                               Coding
cry1Aa1   M11250      Schnepf et al.             1985  JBC 260: 6264-6272                    527-4054
cry1Aa2   M10917      Shibano et al.             1985  Gene 34: 243-251                      153-2955
cry1Aa3   D00348      Shimizu et al.             1988  ABC 52: 1565-1573                     73-3600
cry1Aa4   X13535      Masson et al.              1989  NAR 17: 446-446                       1-3528
cry1Aa5   D17518      Udayasuriyan et al.        1994  BBB 58: 830-835                       81-3608
cry1Aa6   U43605      Masson et al.              1994  Mol Micro 14: 851-860                 1-1860
cry1Ab1   M13898      Wabiko et al.              1986  DNA 5: 305-314                        142-3606
cry1Ab2   M12661      Thorne et al.              1986  J. Bact 166: 801-811                  155-3622
cry1Ab3   M15271      Geiser et al.              1986  Gene 48: 109-118                      156-3620
cry1Ab4   D00117      Kondo et al.               1987  ABC 51: 455-463                       163-3627
cry1Ab5   X04698      Hofte et al.               1986  EJB 161: 273-280                      141-3605
cry1Ab6   M37263      Hefford et al.             1987  J. Biotech 6: 307-322                 73-3537
cry1Ab7   X13233      Haider & Ellar             1988  NAR 16: 10927-10927                   1-3465
cry1Ab8   M16463      Oeda et al.                1987  Gene 53: 113-119                      157-3624
cry1Ab9   X54939      Chak & Jen                 1993  PNSCRC 17: 7-14                       73-3540
cry1Ab10  A29125      Fischhoff et al.           1987  Bio/technology 5: 807-813             peptide seq
cry1Ac1   M11068      Adang et al.               1985  Gene 36: 289-300                      388-3921
cry1Ac2   M35524      Von Tersch et al.          1991  AEM 57: 349-358                       239-3769
cry1Ac3   X54159      Dardenne et al.            1990  NAR 18: 5546-5546                     339-2192
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Table 2 (con't) 



Name      Acc. No.    Reference                  Year  Journal                               Coding
cry1Ac4   [illegible] Payne et al.               1991  USP 4990332                           1-3534
cry1Ac5   [illegible] Payne et al.               1992  USP 5135867                           1-3531
cry1Ac6   [illegible] Masson et al.              1994  Mol. Micro. 14: 851-860               1-1821
cry1Ac7   U87793      Herrera et al.             1994  AEM 60: 682-690                       976-4509
cry1Ac8   U87397      Omolo et al.               1997  Curr. Micro. 34: 118-121              153-3686
cry1Ac9   U89872      Gleave et al.              1992  NZJCHS 20: 27-36                      388-3921
cry1Ac10  AJ002514    Sun and Yu                 1997  unpublished                           388-3921
cry1Ad1   M73250      Payne & Sick               1993  USP 5246852                           1-3537
cry1Ae1   M65252      Lee & Aronson              1991  J Bact 173: 6635-6638                 81-3623
cry1Af1   U82003      Kang et al.                1997  unpublished                           172-2905
cry1Ba1   X06711      Brizzard & Whiteley        1988  NAR 16: 2723-2724                     1-3684
cry1Ba2   X95704      Soetaert                   1996  unpublished                           186-3869
cry1Bb1   L32020      Donovan et al.             1994  USP 5322687                           67-3753
cry1Bc1   Z46442      Bishop et al.              1994  unpublished                           141-3839
cry1Bd1   U70726      Chak                       1996  unpublished                           842-4534
cry1Ca1   X07518      Honee et al.               1988  NAR 16: 6240-6240                     47-3613
cry1Ca2   X13620      Sanchis et al.             1989  Mol Micro 3: 229-238                  241-2711
cry1Ca3   M73251      Payne & Sick               1993  USP 5246852                           1-3570
cry1Ca4   A27642      Van Mellaert et al.        1990  EP 0400246                            234-3800
cry1Ca5   X96682      Strizhov                   1996  unpublished                           1-2268
cry1Ca6   X96683      Strizhov                   1996  unpublished                           1-2268
cry1Ca7   X96684      Strizhov                   1996  unpublished                           1-2286
cry1Cb1   M97880      Kalman et al.              1993  AEM 59: 1131-1137                     296-3823
cry1Da1   X54160      Hofte et al.               1990  NAR 18: 5545-5545                     264-3758
cry1Db1   Z22511      Lambert                    1993  unpublished                           241-3720
cry1Ea1   X53985      Visser et al.              1990  JBact 172: 6783-6788                  130-3642
cry1Ea2   X56144      Bosse et al.               1990  NAR 18: 7443-7443                     1-3513
cry1Ea3   M73252      Payne & Sick               1991  USP 5039523                           1-3513
cry1Ea4   U94323      Ibarra et al.              1997  unpublished                           388-3900
cry1Eb1   M73253      Payne & Sick               1993  USP 5206166                           1-3522
cry1Fa1   M63897      Chambers et al.            1991  JBact 173: 3966-3976                  478-3999
cry1Fa2   M73254      Payne & Sick               1993  USP 5188960                           1-3525
cry1Fb1   Z22512      Lambert                    1993  unpublished                           483-4004
cry1Fb2   AB012288    Masuda & Asano             1998  unpublished                           84-3587
cry1Ga1   Z22510      Lambert                    1993  unpublished                           67-3564
cry1Ga2   Y09326      Shevelev et al.            1997  FEBS Lett 404: 148-152                692-4210
cry1Gb1   U70725      Chak                       1996  unpublished                           532-4038
cry1Ha1   Z22513      Lambert                    1993  unpublished                           530-4045
cry1Hb1   U35780      Koo et al.                 1995  unpublished                           728-4195
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Table 2 (con't) 



Name      Acc. No.    Reference                  Year  Journal                               Coding
cry1Ia1   X62821      Tailor et al.              1992  Mol Micro 6: 1211-1217                355-2511
cry1Ia2   M98544      Gleave et al.              1993  AEM 59: 1683-1687                     1-2160
cry1Ia3   L36338      Shin et al.                1995  AEM 61: 2402-2407                     279-2435
cry1Ia4   L49391      Kostichka et al.           1996  JBact 178: 2141-2144                  61-2217
cry1Ia5   Y08920      Selvapandiyan              1996  unpublished                           524-2680
cry1Ib1   U07642      Shin et al.                1995  AEM 61: 2402-2407                     237-2393
cry1Ja1   L32019      Donovan et al.             1994  USP 5322687                           99-3519
cry1Jb1   U31527      Von Tersch & Gonzalez      1994  USP 5356623                           177-3686
cry1Ka1   U28801      Koo et al.                 1995  FEMS 134: 159-164                     451-4098
cry2Aa1   M31738      Donovan et al.             1989  JBC 264: 4740-4740                    156-2054
cry2Aa2   M23723      Widner & Whiteley          1989  JBact 171: 965-974                    1840-3738
cry2Aa3   D86064      Sasaki et al.              1997  Curr Micro 35: 1-8                    2007-3911
cry2Aa4   AF047038    Misra et al.               1998  unpublished                           10-1909
cry2Ab1   M23724      Widner & Whiteley          1989  JBact 171: 965-974                    1-1899
cry2Ab2   X55416      Dankocsik et al.           1990  Mol Micro 4: 2087-2094                874-2775
cry2Ac1   X57252      Wu et al.                  1991  FEMS 81: 31-36                        2125-3990
cry3Aa1   M22472      Herrnstadt et al.          1987  Gene 57: 37-46                        25-1956
cry3Aa2   J02978      Sekar et al.               1987  PNAS 84: 7036-7040                    241-2175
cry3Aa3   Y00420      Hofte et al.               1987  NAR 15: 7183-7183                     566-2497
cry3Aa4   M30503      McPherson et al.           1988  Bio/technology 6: 61-66               201-2135
cry3Aa5   M37207      Donovan et al.             1988  MGG 214: 365-372                      569-2500
cry1Ic1   AF056933    Osman et al.               1998  unpublished                           1-2180
cry3Aa6   U10985      Adams et al.               1994  Mol Micro 14: 381-389                 569-2500
cry3Ba1   X17123      Sick et al.                1990  NAR 18: 1305-1305                     25-1977
cry3Ba2   A07234      Peferoen et al.            1990  EP 0382990                            342-2297
cry3Bb1   M89794      Donovan et al.             1992  AEM 58: 3921-3927                     202-2157
cry3Bb2   U31633      Donovan et al.             1995  USP 5378625                           144-2099
cry3Ca1   X59797      Lambert et al.             1992  Gene 110: 131-132                     232-2178
cry4Aa1   Y00423      Ward & Ellar               1987  NAR 15: 7195-7195                     1O540
cry4Aa2   D00248      Sen et al.                 1988  ABC 52: 873-878                       393-3935
cry4Ba1   X07423      Chungjatupornchai et al.   1988  EJB 173: 9-16                         157-3564
cry4Ba2   X07082      Tungpradubkul et al.       1988  NAR 16: 1637-1638                     151-3558
cry4Ba3   M20242      Yamamoto et al.            1988  Gene 66: 107-120                      526-3930
cry4Ba4   D00247      Sen et al.                 1988  ABC 52: 873-878                       461-3865
cry5Aa1   L07025      Sick et al.                1994  USP 5281530                           1-4155
cry5Ab1   L07026      Narva et al.               1991  EP 0462721                            1-3867
cry5Ac1   I34543      Payne et al.               1997  USP 5596071                           1-3660



WO 99/57128 PCT/US99/08473



Table 2 (con't) 



Name      Acc. No.    Reference                  Year  Journal                               Coding
cry5Ba1   U19725      Payne et al.               1997  USP 5596071                           1-3735
cry6Aa1   L07022      Narva et al.               1993  USP 5236843                           1-1425
cry6Ba1   L07024      Narva et al.               1991  EP 0462721                            1-1185
cry7Aa1   M64478      Lambert et al.             1992  AEM 58: 2536-2542                     184-3597
cry7Ab1   U04367      Payne & Fu                 1994  USP 5286486                           1-3414
cry7Ab2   U04368      Payne & Fu                 1994  USP 5286486                           1-3414
cry8Aa1   U04364      Foncerrada et al.          1992  EP 0498537                            1-3471
cry8Ba1   U04365      Michaels et al.            1993  WO 93/15206                           1-3507
cry8Ca1   U04366      Ogiwara et al.             1995  Curr Micro 30: 227-235                1-3447
cry9Aa1   X58120      Smulevitch et al.          1991  FEBS 293: 25-28                       5807-9274
cry9Aa2   X58534      Gleave et al.              1992  JGM 138: 55-62                        385-3837
cry9Ba1   X75019      Shevelev et al.            1993  FEBS 336: 79-82                       26-3488
cry9Ca1   Z37527      Lambert et al.             1996  AEM 62: 80-86                         2096-5569
cry9Da1   D85560      Asano et al.               1997  AEM 63: 1054-1057                     47-3553
cry9Da2   AF042733    Wasano & Ohba              1998  unpublished                           <1-1937
cry9Ea1   AB011496    Midoh and Oyama            1998  unpublished                           211-3663
cry10Aa1  M12662      Thorne et al.              1986  JBact 166: 801-811                    941-2965
cry11Aa1  M31737      Donovan et al.             1988  JBact 170: 4732-4738                  41-1969
cry11Aa2  M22860      Adams et al.               1989  JBact 171: 521-530                    <1-235
cry11Ba1  X86902      Delecluse                  1995  AEM 61: 4230-4235                     64-2238
cry11Bb1  AF017416    Orduz et al.               1998  Biochem. Biophys. Acta 1388: 267-272  97-2349
cry12Aa1  L07027      Narva et al.               1991  EP 0462721                            1-3771
cry13Aa1  L07023      Narva et al.               1992  WO 92/19739                           1-2409
cry14Aa1  U13955      Narva et al.               1994  WO 94/16079                           1-3558
cry15Aa1  M76442      Brown & Whiteley           1992  JBact 174: 549-557                    1036-2055
cry16Aa1  X94146      Barloy et al.              1996  JBact 178: 3099-3105                  158-1996
cry17Aa1  X99478      Barloy et al.              1997  unpublished                           12-1865
cry18Aa1  X99049      Zhang et al.               1997  JBact 179: 4336-4341                  743-2860
cry19Aa1  Y07603      Rosso and Delecluse        1996  AEM 63: 4449-4455                     719-2662
cry19Ba1  D88381                                       unpublished
cry20Aa1  U82518      Lee & Gill                 1997  AEM 63: 4664-4670                     60-2318
cry21Aa1  I32932      Payne et al.               1996  USP 5589382                           1-3501
cry22Aa1  I34547      Payne et al.               1997  USP 5596071                           1-2169
cyt1Aa1   X03182      Waalwijk et al.            1985  NAR 13: 8207-8217                     140-886
cyt1Aa2   X04338      Ward & Ellar               1986  JMB 191: 1-11                         509-1255
cyt1Aa3   Y00135      Earp & Ellar               1987  NAR 15: 3619-3619                     36-782
cry24Aa1  U88188      Kawalek                    1998  unpublished                           1->2024
cry25Aa1  U88189      Kawalek                    1998  unpublished                           1-2028
cry26Aa   AF122897    Wojciechowska et al.       1999  unpublished                           897-4388
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Table 2 (con't) 



Name         Acc. No.    Reference                  Year  Journal                               Coding
cry28Aa1     AF132928    Wojciechowska et al.       1999  unpublished                           1129-4458
cyt1Aa4      M35968      Galjart et al.             1987  Curr Micro 16: 171-177                67-816
cyt1Ab1      X98793      Thiery et al.              1997  AEM 63: 468-473                       28-777
cyt1Ba1      U37196      Payne et al.               1995  USP 5436002                           1-795
cyt2Aa1      Z14147      Koni & Ellar               1993  JMB 229: 319-327                      270-1046
cyt2Ba1      U52043      Guerchicoff et al.         1997  AEM 63: 2716-2721                     287-655
cyt2Ba2      AF020789    Guerchicoff et al.         1997  AEM 63: 2716-2721                     <1->469
cyt2Ba3      AF022884    Guerchicoff et al.         1997  AEM 63: 2716-2721                     <1->469
cyt2Ba4      AF022885    Guerchicoff et al.         1997  AEM 63: 2716-2721                     <1->469
cyt2Ba5      AF022886    Guerchicoff et al.         1997  AEM 63: 2716-2721                     <1->471
cyt2Ba6      AF034926    Guerchicoff et al.         1997  AEM 63: 2716-2721                     <1->472
cyt2Bb1      U82519      Cheong & Gill              1997  AEM 63: 3254-3260                     416-1204
40kDa        M76442      Brown and Whiteley         1992  JBact 174: 549-557                    45-971
[illegible]  X92691      Juarez-Perez et al.        1993  unpublished                           1-981
cryTDK       D86346      Hashimoto                  1996  unpublished                           177-2645
cryC53       X98616      Juarez-Perez et al.        1996  unpublished                           1-1005
vip3A(a)     L48811      Estruch et al.             1996  PNAS 93: 5389-5394                    739-3105
vip3A(b)     L48812      Estruch et al.             1996  PNAS 93: 5389-5394                    118-2484
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Expression of the shuffled genes can be achieved in E. coli or any bacilli by 
using an appropriate expression vector. Most, if not all, Bt toxin promoters associated with
cry genes will function in E. coli as well as bacilli. An example of a suitable vector for use in
E. coli host cells is described in Sasaki et al. ((1996) Curr. Microbiol. 31: 195-200). For high
expression in E. coli, a portion of the cry promoter between the ApaI and NdeI sites is removed
from the vector described by Sasaki et al. In presently preferred embodiments, the vector
also includes coding sequences that, when linked in frame to the coding sequence of the
shuffled gene, encode an easily detectable and/or immobilizable tag (e.g., multiple His
residues).

The cry gene can be truncated to produce a pre-activated Cry protein. It was
found in a number of cases that the truncated gene produces a protein that is substantially
toxic to E. coli. In preferred embodiments, however, the truncated cry gene is expressed in a
bacillus (e.g., Bacillus cereus or B. thuringiensis). A leader sequence can be added to the cry
gene so that the protein is secreted into the culture medium. This approach makes the protein
isolation process less time consuming.

Those recombinant genes that encode Bt toxins having improvements in one
or more desired properties are identified as described herein. Screening methodologies for
some of these properties are described in Kumar et al., supra.

The optimized recombinant Bt toxin genes can be used for the production of
pesticidal proteins for direct application to plants, can be expressed in microorganisms that
colonize plants, or can be introduced into transgenic plants. Bt genes have been expressed in
at least twenty-six different plant species (Schuler et al. (1998) Tibtech 16: 168-175). Each
of these modes of administration is discussed in more detail below.

B. Protease and α-Amylase Inhibitors

Additional pest resistance genes that can be optimized using the methods of
the invention are those that encode protease inhibitors. Protease inhibitors can inhibit insect
development (for review, see, e.g., Reeck et al. (1997) In Advances in Pest Control, supra,
Chapter 10, pp. 157-183; Ryan (1990) Annu. Rev. Phytopathol. 28: 425-449) and often can
kill insects and nematodes (see, Jongsma (1997) J. Insect Physiol. 43: 885-895). Protease
inhibitors found in plant tissues are considered to be a part of the plant defense mechanism
against insect and nematode attack. A problem with the protease inhibitors for insect control
is that insects can become resistant to the inhibitor: Jongsma and Bolter ((1997) J. Insect
Physiol. 43: 885-895) described that insects change the composition of proteases in the
digestive tract when an inhibitor is fed. It is therefore important to find or produce an inhibitor
which inhibits a wide variety of insect proteases. In this example, we shall attempt to
improve a plant cysteine protease inhibitor by DNA shuffling.

Protease inhibitor genes that are useful for shuffling include those from all
biological sources, including plants, animals, and microorganisms. Several nonhomologous
families of protease inhibitors are known (Laskowski and Kato (1980) Annu. Rev. Biochem. 49:
593-626), including at least ten families in plants (soybean trypsin inhibitor (Kunitz),
Bowman-Birk inhibitor, potato inhibitor I, potato inhibitor II, squash inhibitor, Ragi I-2/
maize bifunctional inhibitor, carboxypeptidase A, B inhibitor, cysteine proteinase inhibitor
(cystatins), aspartyl proteinase inhibitor, and barley trypsin inhibitor) (see, e.g., Ignacimuthu,
In Biotechnological perspectives in chemical ecology of insects, T. Ananthakrishnan, ed.,
Science Publishers, Inc., pp. 277-283). Inhibitor families are known for each of the four
mechanistic classes of proteolytic enzymes (serine, cysteine, aspartic, and metallo-proteases)
(Ryan, supra). Sequences of cysteine protease inhibitors are described in, for example,
Reddy et al. (1975) J. Biol. Chem. 250: 1741-1750 and Abe et al. (1987) J. Biol. Chem. 262:
16793-16797. Serine protease inhibitors are described in, for example, US Patent No.
5,151,509.

Nucleic acids that encode α-amylase inhibitors, some of which are also
bifunctional as protease inhibitors, are also suitable candidates for optimization using the
DNA shuffling methods of the invention. Many of the α-amylase inhibitors exhibit amino
acid similarity to four of the protease inhibitor families of plants (i.e., the Kunitz, Barley,
Bowman-Birk and the Ragi/Maize bifunctional inhibitor families) (see, e.g., Ryan et al.,
supra). Sequences of Ragi α-amylase/protease inhibitors are described in, for example,
Shivaraj et al. (1981) Biochem. J. 193: 29-36 and Svendsen et al. (1986) Carlsberg Res.
Commun. 51: 43-50. See also, Schuler et al., supra.

Protease inhibitors of plant origin that have been engineered into other plant
species are reviewed in, for example, Schuler et al. (1998) Tibtech 16: 168-175, and Hilder et al.
(1993) In Transgenic Plants, Vol. 1, Kung and Wu, Eds., Academic Press, pp. 317-338.
Transgenic plants that carry a Manduca sexta protease inhibitor are described in US Patent
No. 5,436,392. Nematode control using protease inhibitors is described in US Patent No.
5,494,813.

To identify recombinant genes that encode protease inhibitors having
improved properties for use as pest resistance genes in plants, one can use assays such as
those described herein. One suitable assay involves expressing the library of recombinant
genes by phage display, after which panning is employed using a protease substrate. See,
e.g., Jongsma et al. (1995) Molecular Breeding 1: 181-191.

C. Cholesterol Oxidase

Genes encoding polyphenol oxidases, including cholesterol oxidases, are
another suitable substrate for use in the methods of the invention. Cholesterol oxidases are
described in, for example, Shen et al. (1997) Arch. Insect Biochem. Physiol. 34: 429-442 and
Purcell (1997) In Advances in Insect Control, Chapter 6, pp. 95-108; US Patent Nos.
5,665,560 and 5,602,017; PCT application WO9425603; and Genbank Accession Nos.
I64550, E07692, E07691, E03850, E03828, E03827, U13981, and D00712.
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examples of natural plant and microbial products that are insecticidal. The genes involved in 

the biosynthesis of these products can be shuffled to increase the compound yield. The
number of genes involved in the biosynthetic pathways specifying various natural products
varies depending on the nature of the product. DNA shuffling can be applied to the entire set
of genes coding for enzymes of a biochemical pathway for production of these natural
products. As a result, many of these products can be produced at much higher concentrations
either in a fermentor (for microorganisms) or in planta. In other embodiments, the shuffled
genes are selected for other improved properties, including, for example, increased toxicity
and/or host range. These shuffled genes can be introduced in planta for in-plant protection
from insects.

G. Baculoviruses 

Also suitable for use as substrates for DNA shuffling to generate recombinant
nucleic acids which confer pest resistance are genes and genomes derived from insecticidal
viruses, including baculoviruses. The use of baculoviruses as insecticides, as well as the
identification of baculovirus genes that encode insecticidal proteins, is described in, for
example, US Patent No. 5,662,897; see also, Miller, L. K. (1981) in Genetic Engineering in
the Plant Sciences, Panopoulous (ed.), Praeger Publ., New York, pp. 203-224; Carstens
(1980) Trends Biochem. Sci. 52: 107-110; Harrap and Payne (1979) in Advances in Virus
Research, Vol. 25, Lawfer et al. (eds.), Academic Press, New York, pp. 273-355; The
Biology of Baculoviruses, Vol. I and II, Granados and Federici (eds.), CRC Press, Boca
Raton, Fla., 1986.

The DNA shuffling and screening methods of the invention are useful for
obtaining insecticidal viruses that have improved properties including, but not limited to,
increased stability (including UV stability), greater infectivity and host range, greater
virulence, and reduced time to kill a pest. The length of time between baculovirus ingestion
and insect death can sometimes limit the efficacy of baculoviruses as pesticides, as the insect
can continue to feed and damage crops during the time between application of the pesticide
and insect death. By use of DNA shuffling and screening as described herein, one can obtain
baculoviruses that are capable of killing the insects more quickly than naturally-occurring
baculoviruses. Bioassays for determining the virulence and infectivity of baculoviruses are
described in US Patent No. 5,662,897.
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D. Insecticidal Proteases 

Additional targets for optimization using the DNA shuffling methods of the 
invention are genes that encode insecticidal proteases. 

E. Vegetative Insecticidal Proteins 

The DNA shuffling methods of the invention can also be applied to
polynucleotides that encode vegetative insecticidal proteins (VIPs). VIPs are produced by
some Bacillus species (including [illegible]) during the vegetative growth
phase. See, e.g., Warren (1997) In Advances in Insect Control, supra, Chapter 7, pp. 109-
121. The VIPs bear no similarity to the δ-endotoxins produced by B. thuringiensis.

VIPs that are effective against important corn pests, such as corn rootworm,
include, for example, Vip1A(a) and Vip2A(a) (Warren, supra). Vip3A is effective against a
broad spectrum of lepidopteran insects (Estruch et al. (1996) Proc. Nat'l. Acad. Sci. USA 93:
5389-5394; Yu et al. (1997) Appl. Environ. Microbiol. 63: 532-536).
F. Pathways for Insecticides 

The invention also provides methods of applying DNA shuffling to obtain 
genes that encode pathways involved in the biosynthesis of natural products that have anti- 
pest activity. 

(1) Polyketides 

One approach that is particularly useful for shuffling of pathways such as
those involved in biosynthesis of insecticides involves the use of restriction sites to
recombine mutations. Polyketide clusters, e.g., spinosin (Khosla et al., TIBTECH 14,
September 1996), are typically 10 to 100 kb in length, specifying multiple large polypeptides
which assemble into very large multienzyme complexes. Due to the modular nature of these
complexes and the modular nature of the biosynthetic pathway, nucleic acids encoding
protein modules can be exchanged between different polyketide clusters to generate novel
and functional chimeric polyketides. The introduction of rare restriction endonuclease sites
such as SfiI (eight base recognition, nonpalindromic overhangs) at nonessential sites between
polypeptides or in introns engineered within polypeptides would provide "handles" with
which to manipulate exchange of nucleic acid segments using the technique described above.
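
By way of illustration only, the use of SfiI "handles" for module exchange can be sketched at the
sequence level. The sketch assumes the standard SfiI recognition sequence GGCCNNNNNGGCC (eight
specified bases flanking five unspecified ones). The function names and the idea of swapping the
segment between the first two sites are hypothetical simplifications, and the string splice ignores
overhang compatibility, which in practice requires matching unspecified bases at the paired sites.

```python
import re

# SfiI recognition sequence: GGCC, five arbitrary bases, GGCC (13 bp in total).
SFII_SITE = re.compile(r"GGCC[ACGT]{5}GGCC")
SITE_LENGTH = 13


def find_sfii_sites(seq):
    """Return (0-based start, matched site) for each non-overlapping SfiI site."""
    return [(m.start(), m.group(0)) for m in SFII_SITE.finditer(seq.upper())]


def swap_module(recipient, donor):
    """Hypothetical helper: replace the segment lying between the first two
    SfiI sites of `recipient` with the corresponding segment of `donor`."""
    recipient, donor = recipient.upper(), donor.upper()
    r_sites = find_sfii_sites(recipient)
    d_sites = find_sfii_sites(donor)
    if len(r_sites) < 2 or len(d_sites) < 2:
        raise ValueError("each cluster needs two flanking SfiI handles")
    r_start, r_end = r_sites[0][0] + SITE_LENGTH, r_sites[1][0]
    d_start, d_end = d_sites[0][0] + SITE_LENGTH, d_sites[1][0]
    return recipient[:r_start] + donor[d_start:d_end] + recipient[r_end:]


if __name__ == "__main__":
    handle_a = "GGCCAAAAAGGCC"
    handle_b = "GGCCTTTTTGGCC"
    recipient = "AT" + handle_a + "CCCC" + handle_b + "TA"
    donor = "GC" + handle_a + "GGGGGG" + handle_b + "CG"
    print(find_sfii_sites(recipient))
    print(swap_module(recipient, donor))
```

Because an eight-base recognition sequence occurs rarely by chance, a site search of this kind is
also a quick way to confirm that a 10 to 100 kb cluster carries no unintended internal SfiI sites.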

(2) Other Natural Products 

Several examples are known of natural products that are potent insecticides. 
These products are elaborated by microorganisms, fungi or plants. There are several 
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Baculoviruses are known to recombine in vivo. For example, Croizier et aL 
((1980) C. R. Acad. Sci. Paris Ser. D 290: 579-582) reported that AcNPV and Galleria
mellonella virus recombined in Galleria larvae. More recently, Kondo and Maeda ((1991) J.
Virol. 65: 3625-3632) reported widening the host specificity of NPV by recombination in
insect cells. DNA shuffling can expand and accelerate this process. For example, viral
genome shuffling among several NPV [text illegible in source] can be
used to increase the host spectrum. This is accomplished by obtaining NPVs such as those of
Autographa californica, Spodoptera frugiperda and Heliothis virescens and
isolating DNA from the viruses. These DNA samples are mixed and shuffled. Sf9 cells are
transfected with shuffled and reassembled DNA, and the recombinant virus is isolated.
Isolated virus samples are then tested for infectivity against insect species such as, for
example, Trichoplusia ni, Heliothis virescens and Spodoptera exigua. A sublethal dose,
which is determined with the wild-type virus against its original or related host (e.g., AcNPV
vs T. ni, SfNPV vs S. exigua), is used.

The insecticidal viruses that are obtained using the methods of the invention
are useful for application to plants. Formulations and application methods are known to
those of skill in the art. See, e.g., Couch and Ignoffo (1981) in Microbial Control of Pests
and Plant Disease 1970-1980, Burges (ed.), chapter 34, pp. 621-634; Corke and Rishbeth,
Id., chapter 39, pp. 717-732; Brockwell (1980) in Methods for Evaluating Nitrogen Fixation,
Bergersen (ed.) pp. 417-488; Burton (1982) in Biological Nitrogen Fixation Technology for
Tropical Agriculture, Graham and Harris (eds.) pp. 105-114; and Roughley (1982) Id., pp.
115-127.

IV. IMPROVED PROPERTIES OF PEST RESISTANCE GENES AND SCREENING
METHODS

The libraries of recombinant pest resistance genes that are produced using the
DNA shuffling methods described herein are screened to identify those that exhibit
improved properties for use in protecting plants against pests. Included among properties for
which the methods of the invention are useful for obtaining improved pest resistance genes
are the following. By choice of an appropriate screening strategy, one can simultaneously or
sequentially obtain genes that are optimized for more than one property. For example, by
performing shuffling using as one substrate genes that encode highly potent toxins, and as
another substrate genes that are not easily overcome by the development of resistance to the
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gene product by the target, one can obtain an optimized gene that combines the two 
properties of being highly potent and not susceptible to the development of target resistance. 

The invention thus provides the shuffled polynucleotide sequence(s) that
confer insect resistance on an agricultural organism, and the modified agricultural organisms
themselves, produced by the method of polynucleotide sequence shuffling. The exact
structures of said produced polynucleotide sequences and modified agricultural organisms
are definable most readily by reference to the method by which they are generated. Thus, the
invention includes a shuffled polynucleotide sequence conferring the desired phenotype, or a
plurality thereof, produced by the methods described herein. The shuffled polynucleotide(s)
produced thereby are easily distinguishable from naturally occurring genome sequences by
virtue of their atypical modified or novel phenotype(s) which is/are normally not present in
the population of naturally occurring agricultural organisms. The shuffled polynucleotide
sequence can be further distinguished from naturally-occurring plant, animal, or microbe
genome sequences by reference to sequence databases and published sequence data, wherein
the shuffled polynucleotide will generally comprise a constellation of mutations as compared
to the reference data set which would be recognized by the skilled artisan as a polynucleotide
sequence which is substantially improbable of having evolved by natural evolution or
classical breeding.

A. Increased Potency against Target Pests 

The methods of the invention are useful for obtaining pest resistance genes
that exhibit increased potency against target pests. The shuffled insect resistance genes
prepared as described above are screened for high insecticidal activity. Such genes can be
identified by, for example, expressing members of a library of shuffled genes to identify
those that encode a polypeptide that has an improved (i.e., lower) EC50 (concentration resulting
in 50% reduction in insect growth) and/or LC50 (concentration resulting in 50% insect mortality).
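
For illustration, an LC50 can be estimated from a dose series and compared between a wild-type toxin
and a shuffled variant. The following is a minimal sketch, assuming mortality has been measured at
ascending doses and that linear interpolation on log10(dose) between the two bracketing doses is an
acceptable approximation; the function names and all numbers are hypothetical, and a probit or
logistic fit would be a more rigorous alternative.

```python
import math


def lc50(doses, mortality):
    """Estimate the dose giving 50% mortality by linear interpolation on log10(dose).

    doses: ascending, positive concentrations.
    mortality: fraction killed (0..1) at each dose.
    Returns None if the tested doses do not bracket 50% mortality.
    """
    points = list(zip(doses, mortality))
    for (d_lo, m_lo), (d_hi, m_hi) in zip(points, points[1:]):
        if m_lo <= 0.5 <= m_hi and m_hi > m_lo:
            frac = (0.5 - m_lo) / (m_hi - m_lo)
            log_d = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_d
    return None


def fold_improvement(wild_type_lc50, variant_lc50):
    """A value above 1 means the variant kills at a lower concentration."""
    return wild_type_lc50 / variant_lc50


if __name__ == "__main__":
    doses = [1.0, 10.0, 100.0]            # hypothetical concentrations
    wild_type = lc50(doses, [0.00, 0.25, 0.75])
    variant = lc50(doses, [0.10, 0.50, 0.90])
    print(wild_type, variant, fold_improvement(wild_type, variant))
```

The same interpolation applies to EC50 if the mortality fractions are replaced by fractional
reduction in insect growth.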

In some embodiments, the invention involves shuffling a gene that encodes a 
toxin having a desired specificity, but relatively low cytotoxicity, with another toxin gene 
that has high cytotoxicity. An illustrative example is Bacillus popilliae, which is pathogenic 
to scarab beetles such as the Japanese beetle and produces an insecticidal protein known as 
Cry18Aa (Zhang et al. (1997) J. Bact. 179: 4336-4341). The insecticidal activity of this
protein, however, is not sufficiently high for use to protect plants from beetle infestation. To
improve the cytotoxicity of Cry18Aa, the gene that encodes this toxin is cloned and shuffled
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With one or more of its homologous genes from another Bacillus species. For example one 
can shuffle the gene that encodes Cry18Aa with the B. thuringiensis gene that encodes Cry2.
Other genes that are homologous to cry18Aa can also be cloned and shuffled with cry18Aa.
For example, one can screen a genomic library of several B. thuringiensis and B. popilliae
strains using the cloned cry18Aa gene as a hybridization probe.

Once the shuffling is completed, the resulting library of toxin genes
is screened to identify those that exhibit enhanced insecticidal activity. One way of
performing this screening is to clone the protein coding region of the shuffled genes (for
example, after PCR amplification) into an expression vector that is suitable for expressing
the genes in a chosen host cell such as, for example, E. coli. In presently preferred
embodiments, the vector includes coding sequences that, when linked in frame to the coding
sequence of the shuffled gene, encode an easily detectable and/or immobilizable tag (e.g.,
multiple His residues). The vectors can be introduced into E. coli, as well as into other host
cells such as a cry- strain of B. thuringiensis. If desired, transformants can be subjected to a
preliminary screen (e.g., by immunoassay) to identify those that produce the insecticidal
protein. Those that are positive in the preliminary screen are then tested in a functional
screen to identify shuffled genes that encode a toxin having the desired increase in activity.

A whole pest assay, which is often called an in vivo assay, can be used for
determining toxicity. In these assays, the toxin polypeptides expressed from the shuffled
genes are placed on pest diet and allowed to be consumed by the target pest. Preferably the
shuffled polypeptides are at least partially purified prior to the screening. For example, when
E. coli is used as the host cell for expression of the shuffled polypeptides, the polypeptides
are often produced as inclusion bodies. The inclusion bodies can be liberated using methods
known to those of skill in the art. For example, the E. coli cells can be dissociated using a
detergent such as B-PER Bacterial Protein Extraction Reagent (Pierce) according to the
manufacturer's instructions. The detergent can be removed, e.g., by filtration, and the
inclusion body dissolved in, for example, 0.02N NaOH. The pH of the solution is then
neutralized, e.g., by addition of 100 mM Tris-HCl, pH 8. In presently preferred
embodiments, the insecticidal protein encoded by the shuffled gene is purified.
Conveniently, this can be accomplished using a 96- or more well filter plate that contains an
affinity reagent (such as Ni-NTA agarose (Qiagen) for a polypeptide that has a histidine tag).
Preferably, a sufficient number of host cells is subjected to extraction to ensure that the
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amount of polypeptide passed through the filter exceeds the capacity of the affinity reagent, 
regardless of the expression level of the particular polypeptide. Upon dissociation from the 
affinity reagent, each sample will then contain a roughly equal amount of protein. 
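
The equalizing effect of overloading the affinity reagent can be illustrated with a small sketch. It assumes an idealized resin that binds protein up to a fixed capacity and releases all of it on elution; the function name, the capacity value, and the clone loads are hypothetical and are not taken from the procedure above.

```python
def eluted_amount(loaded_ug: float, capacity_ug: float) -> float:
    """Protein recovered from one well of an idealized affinity filter plate.

    Assumption: the resin binds everything up to its capacity and nothing
    beyond it, and elution recovers all bound protein.
    """
    return min(loaded_ug, capacity_ug)


# Hypothetical loads (micrograms) for clones with very different expression levels.
loads = {"clone_A": 40.0, "clone_B": 120.0, "clone_C": 15.0}
capacity = 10.0  # assumed binding capacity per well, in micrograms

# Every load exceeds the capacity, so every well yields the same amount.
recovered = {name: eluted_amount(load, capacity) for name, load in loads.items()}

# A well loaded below capacity would break the normalization.
underloaded = eluted_amount(4.0, capacity)
```

Under these assumptions the three clones each yield the resin capacity, while the underloaded well yields less, which is why enough host cells must be extracted to saturate the reagent.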

The amount of polypeptide used in each whole pest test is a sublethal dose, as 
determined using the wild-type polypeptide encoded by the toxin gene used for the shuffling. 
Mortality of the pest is observed to assess the activity level of each polypeptide sample. To 
increase the efficiency of the screening assay, samples can be pooled and tested for activity. 
Pooled samples that show some pest mortality are separated into the individual pool 
components to identify those samples that are responsible for the mortality. Positive samples 
are selected for use, or for a second round of shuffling. 
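
The pooling strategy described above can be sketched as a two-stage screen. The sketch assumes a simplified assay in which a pool shows mortality whenever it contains at least one active sample (dilution of the active sample within the pool is ignored); the function names, sample labels, and pool size are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def screen_pools(
    samples: Sequence[str],
    pool_size: int,
    shows_mortality: Callable[[Sequence[str]], bool],
) -> Tuple[List[str], int]:
    """Test pools first, then retest only the members of positive pools.

    Returns the individual positive samples and the number of assays run.
    """
    hits: List[str] = []
    assays = 0
    for start in range(0, len(samples), pool_size):
        pool = samples[start:start + pool_size]
        assays += 1
        if shows_mortality(pool):
            # Separate the positive pool into its individual components.
            for sample in pool:
                assays += 1
                if shows_mortality([sample]):
                    hits.append(sample)
    return hits, assays


# Hypothetical 96-sample plate in which two samples are active.
samples = [f"s{i}" for i in range(96)]
active = {"s5", "s70"}


def toy_assay(pool: Sequence[str]) -> bool:
    """Stand-in for the whole pest test: any active member kills the pest."""
    return any(s in active for s in pool)


hits, assays = screen_pools(samples, pool_size=8, shows_mortality=toy_assay)
```

In this toy case the two active samples are found with 28 assays (12 pools plus 16 individual retests) rather than 96 individual tests.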

In preferred embodiments, however, the assays for detecting cell death or cell 
growth are conducted in a format that is more amenable to high-throughput screening. For 
example, an in vitro assay can be used. Such assays typically involve the use of cultured 
insect cells that are susceptible to the particular toxin being screened, and/or cells that 
express a receptor for the particular toxin, either naturally or as a result of expression of a 
heterologous gene. Thus, in addition to insect cells, mammalian (e.g., CHO cells), bacterial, 
and yeast cells are among those that are useful in the in vitro assays. In vitro bioassays which 
measure toxicity against cultured insect cells are described in, for example, Johnson (1994) 
J. Invertebr. Pathol. 63: 123-9. In a typical format, a plate having 96 or more wells is used. 
Toxins expressed by the library of shuffled genes are added to the wells and the effect on 
cell viability and/or proliferation is determined. 

One such assay involves detection of the release of ATPase by cells that are 
killed by optimized toxins obtained using DNA shuffling. The level of ATPase that was 
released by the toxin can be measured at a very high sensitivity level with, for example, a 
luciferase assay. 

Another assay involves detection of changes in cell morphology due to water 
uptake. When insect cells are intoxicated with Bt Cry protein, for example, the cell 
morphology changes substantially due to water intake. Since the Cry protein makes the cell 
highly permeable, the cells take up a large amount of water when left in a low osmotic 
solution. This morphological change can be detected by light scattering. 

Dyes and labels that are useful for detecting cell death or cell growth are 
known to those of skill in the art. In these assays, cells are contacted with the toxin in, for 
example, a well of a microtiter plate, after which the cells are washed and the uptake or 
retention of the dye or label is measured using a plate reader or plate scintillation counter. 

Suitable dyes include, but are not limited to: 

Alamar blue: The alamar blue assay incorporates a fluorometric/colorimetric 
growth indicator based on detection of metabolic activity. The system incorporates an 
oxidation-reduction (redox) indicator that both fluoresces and changes color in response to 
chemical reduction of growth medium resulting from cell growth. An aliquot (e.g., 20 µl) of 
Alamar blue is added into each well in the last 8 hr of culture. The plate then is measured by 
absorbance (O.D. 570/600) or by fluorescence. 

3H-thymidine incorporation: The protocol uses as its end-point the 
determination of cell proliferation by measuring the incorporation of 3H-thymidine into 
cellular DNA. An aliquot (e.g., 1 µCi) of radioactive label is added during the last 4 to 24 hr 
of the culture. A semiautomated cell harvesting apparatus can then be used to lyse the cells 
with water and precipitate the labeled DNA on glass fiber filters. The filter pads can then be 
dried and counted by standard liquid scintillation counting techniques. 

Neutral red: Neutral red is a cationic azine dye used to stain cytoplasmic 
granules of cells. For example, at the end of the culture, an aliquot (e.g., 100 µl of a 1:500 
dilution of 0.5% (w/v) neutral red (Sigma Chemical, St. Louis MO)) is added into each well. 
The cells are then incubated in 5% CO2 at 37°C for 2-4 hrs. The color is extracted with 50% 
methanol (with 1% acetic acid), and absorbance is measured at a wavelength of 540 nm. 

Trypan blue test of cell viability : The dye exclusion test is used to determine 
the number of viable cells present in a cell suspension. It is based on the principle that live 
cells possess intact cell membranes that exclude certain dyes, whereas dead cells do not. In 
this test, a cell suspension is simply mixed with dye and then visually examined to determine 
whether cells take up or exclude dye. A viable cell will have a clear cytoplasm whereas a 
nonviable cell will have a blue cytoplasm. This assay can be carried out by, for example, 
centrifuging an aliquot of cell suspension for 5 min at 100 x g and discarding the supernatant. 
The cell pellet is resuspended in 1 ml PBS or serum-free medium. One part of 0.4% trypan 
blue is mixed with one part cell suspension (dilution of cells). The mixture is allowed to 
incubate about 3 min at room temperature. A drop of the trypan blue/cell mixture is then 
applied to a hemocytometer and observed under a binocular microscope. 
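
The counts obtained in the trypan blue test can be converted to a viability figure as in the following sketch. It assumes a standard hemocytometer in which each large counting square holds 0.1 µl (hence the factor of 10,000 per ml) and uses the 1:1 dye-to-cell dilution described above; the function name and the example counts are hypothetical.

```python
from typing import Tuple


def trypan_blue_viability(
    unstained: int,
    stained: int,
    squares_counted: int = 1,
    dilution_factor: float = 2.0,
) -> Tuple[float, float]:
    """Return (percent viable, viable cells per ml of the original suspension).

    unstained: cells with clear cytoplasm (viable)
    stained:   cells with blue cytoplasm (nonviable)
    Assumption: each counted square corresponds to 0.1 microliter, so the
    mean count per square is multiplied by 1e4 to give cells per ml.
    dilution_factor is 2.0 for one part dye mixed with one part cells.
    """
    total = unstained + stained
    if total == 0:
        raise ValueError("no cells were counted")
    percent_viable = 100.0 * unstained / total
    viable_per_ml = (unstained / squares_counted) * dilution_factor * 1e4
    return percent_viable, viable_per_ml


# Hypothetical count: 180 clear and 20 blue cells over 4 large squares.
percent, per_ml = trypan_blue_viability(180, 20, squares_counted=4)
```

With these example counts the suspension is 90% viable, at 9 x 10^5 viable cells per ml.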

One example of a suitable in vitro assay using cultured insect cells is for the 
Bt Cry1C protein. Sf9 (Spodoptera frugiperda) cells are 
sensitive to Cry1C protein. Other insect cell lines, such as Heliothis and Trichoplusia spp., 
could also be used for Cry1C. Sf9 is not highly sensitive to Cry1A proteins. In the case of 
Cry1A and related proteins such as Cry1F and Cry1G, CF1 (Choristoneura fumiferana) cells 
can be used. CF1 cells are highly sensitive to Cry1A-type proteins. When the activated 
Cry1C protein was mixed with Sf9 cells, the Cry protein made the cell membrane highly 
permeable to small molecules such as water. When a dye such as trypan blue was added to 
the cell suspension, those cells which were killed by the Cry protein were stained with the dye. 
Thus, the insecticidal activity level was determined by image analysis. 

Additional in vitro assays involve the use of receptors for the particular 
toxins. The target sites in insects for several insecticidal proteins, including the Bt Cry 
proteins, are midgut epithelial cells. The toxin protein finds a receptor on the cells and forms 
a specific receptor-Cry protein complex. After binding the receptor, the Cry protein goes into 
the cell membrane and forms a pore to make the cell membrane highly permeable. The cells 
thus lose the osmotic pressure regulation and are eventually killed. It appears that the 
receptor binding step, or affinity of the Cry protein to its receptor, is critical for the 
insecticidal activity level. High affinity of a Cry mutant to the receptor means high 
insecticidal activity. Thus, shuffled genes that encode toxins that exhibit enhanced potency 
against a pest can also be identified on the basis of affinity for a specific receptor for the 
toxin. 

In one example of this type of screening assay, brush border membrane 
vesicles (BBMV; see, e.g., Lee et al. (1995) Appl. Environ. Microbiol. 61: 3836-42) are 
used. BBMV, which contain the receptor at a high concentration level, are isolated from 
insects, either from isolated midgut tissue or whole insect body. One advantage of using 
BBMV is that they can be prepared from almost any insects of interest. BBMV are typically 
prepared by simply homogenizing whole insects and repeating differential centrifugations, 
e.g., between 3000 and 12000 rpm. Since the BBMV fraction is heavier than other fractions, 
it can be easily isolated by centrifugation. In one embodiment of this type of screening 
method, radioactive shuffled toxin proteins are prepared by iodination. The radioactive 
proteins are then mixed with BBMV in 96-well plates and allowed to bind. The BBMV are 
washed by filtration to remove free (unbound) proteins. Two sets of plates are prepared with 
identical sample sets. One set of plates is incubated for ten minutes and the other for two 
hours before 100X unlabeled wild-type protein is added. The short reaction time is to 
determine the extent of reversible receptor binding (i.e., measuring the receptor binding) and 
the long incubation time is to determine membrane insertion, which is not reversible. Thus, 
by using two different incubation periods, one can determine the mode of action of the 
protein. When the shuffled proteins are not highly active, the excess cold wild-type protein 
displaces the shuffled proteins from the BBMV. BBMV are then filtered to remove the 
supernate, after which the amount of label present is measured. This allows determination of 
the amount of shuffled protein that is left on BBMV. 


A competitive binding assay is one suitable format for identifying shuffled 
genes that encode toxins having increased affinity for a receptor. For example, a labeled 
(e.g., radioisotope labeled), non-mutated (wild-type) toxin protein is allowed to bind to an 
immobilized receptor (e.g., BBMV-bound receptors). After the excess (unbound) protein is 
washed away, a cold (unlabeled) toxin protein isolated from the DNA-shuffled mutant pool 
as described above is used to compete for binding with the non-mutated toxin proteins. 
When the receptor affinity of a mutated toxin protein is higher than the non-mutated protein, 
the mutant replaces the receptor bound non-mutated protein. Therefore, the amount of label 
associated with the receptors is reduced. By measuring the amount of label associated with 
filtered BBMV, for example, the mutants which have the higher affinity to the receptor are 
identified. Those mutants with high receptor affinity can be confirmed as to elevated 
insecticidal activity by whole insect assay or cell assay as described above. 
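
The readout of the competitive binding assay can be summarized as in the sketch below. It assumes that label counts are measured on the filtered BBMV before and after the cold competitor is added, and that an unlabeled wild-type control is run alongside the mutants as a baseline; the function names, the control, and all counts are hypothetical.

```python
from typing import Dict, List, Tuple


def fraction_displaced(label_before: float, label_after: float) -> float:
    """Fraction of bound labeled wild-type toxin displaced by the cold competitor."""
    if label_before <= 0:
        raise ValueError("label_before must be positive")
    return 1.0 - label_after / label_before


def higher_affinity_mutants(
    counts: Dict[str, Tuple[float, float]], control: str = "wild_type"
) -> List[str]:
    """Return mutants that displace more label than the wild-type control,
    ordered from most to least displacement."""
    baseline = fraction_displaced(*counts[control])
    scored = {
        name: fraction_displaced(before, after)
        for name, (before, after) in counts.items()
        if name != control
    }
    better = [name for name, value in scored.items() if value > baseline]
    return sorted(better, key=lambda name: scored[name], reverse=True)


# Hypothetical counts per well: (label before competitor, label after competitor).
counts = {
    "wild_type": (10000.0, 5000.0),
    "mutant_1": (10000.0, 2000.0),
    "mutant_2": (10000.0, 8000.0),
    "mutant_3": (10000.0, 3500.0),
}
candidates = higher_affinity_mutants(counts)
```

Mutants returned by such a ranking are only candidates; as stated above, elevated insecticidal activity would still be confirmed by whole insect or cell assay.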

The receptor binding assay described above can be applied to insect cells. 
Keeton and Bulla ((1997) Appl. Environ. Microbiol. 63: 3419-3425) demonstrated that a 
mammalian cell line expressing a "Bt toxin receptor" was sensitive to a class of Cry protein 
called Cry1A. The "receptor" gene used by Keeton and Bulla was said to be similar to 
cadherin and has a very limited application, because only a selected few Cry proteins are 
known to bind this receptor. Other receptors for Bt Cry proteins have been identified. Most 
of them were reported to be aminopeptidase N. However, aminopeptidase N also has a 
limited use due to its narrow specificity to the Cry proteins. For example, Cry1C does not 
recognize this receptor protein. However, by cloning into a cell line a receptor gene specific 
to the Cry protein that is being studied by DNA shuffling, a specific binding assay 
protocol can be developed. Receptors for many Bt toxins have been characterized (Cry1A 
toxin receptor from the tobacco hornworm Manduca sexta (Keeton et al. (1997) Appl. 
Environ. Microbiol. 63: 3419-25; Knight et al. (1994) Mol. Microbiol. 11: 429-36; Knight et 
al. (1995) J. Biol. Chem. 270: 17765-70; Masson et al. (1995) J. Biol. Chem. 270: 20309-15; 
Vadlamudi et al. (1995) J. Biol. Chem. 270: 5490-4), gypsy moth (Lymantria dispar) 
(Rajamohan et al. (1996) Proc. Nat'l. Acad. Sci. USA 93:25, 14338-43), Heliothis virescens 
(Luo et al. (1997) Insect Biochem. Mol. Biol. 27: 35-43 and Gill et al. (1995) J. Biol. Chem. 
270: 27277-82)). For Bt toxins, biotinylated proteins can also be used in binding assays (Du 
et al. (1996) Appl. Environ. Microbiol. 62). Cry proteins, when activated, can 
form a pore on liposomes which are made of phospholipids and a dye or radioactive isotope. 
The pore formation due to Cry proteins can be determined by monitoring leaked dye or 
radioisotope. 

In other embodiments, screening is performed by expressing the recombinant 
pest resistance genes as fusion proteins that are displayed on the surface of, for example, a 
phage or other replicable genetic package. The use of phage-display technology to produce 
and screen libraries of polypeptides for binding to a selected target has been described. See, 
e.g., Cwirla et al. (1990) Proc. Nat'l. Acad. Sci. USA 87: 6378-6382; Devlin et al. (1990) 
Science 249: 404-406; Scott & Smith (1990) Science 249: 386-388; Ladner et al., US Patent 
No. 5,571,698. Libraries of recombinant pest resistance genes can also be displayed from 
replicable genetic packages other than phage, such as eukaryotic viruses and bacteria. Phage 
display of a Bt CryIA(a) insecticidal toxin is discussed in Marzari et al. (1997) FEBS Lett. 
411: 27-31. The phage display libraries can be screened by, for example, identifying those 
phage that display a recombinant polypeptide that has an enhanced affinity for an insect 
midgut, or for a receptor polypeptide that binds the toxin. 

In an alternative embodiment, the phage display library is subjected to 
consumption by the target insects. DNA that encodes the recombinant pest resistance gene is 
then amplified from individual insects which die as a result of consuming the phage. For 
example, polymerase chain reaction can be employed using as primers two oligonucleotides 
that hybridize to an expression vector at positions which flank the inserted recombinant pest 
resistance gene. 

Another screening method involves the use of transgenic "hairy roots" that 
are generated by Agrobacterium rhizogenes. This bacterium causes hairy root disease in 
many plants by transferring a portion of DNA from its Ri (root inducing) plasmid to infected 
plant cells (Zambryski et al. (1989) Cell 56: 193-201). Genes present in the transferred DNA 
(T-DNA) alter the hormone balance in the plant cells, causing them to produce roots. Unlike 
normal plant roots, the hairy roots are readily cultured indefinitely on simple medium such 
as Murashige and Skoog (MS; (1962) Physiol. Plant. 15: 473-497). Hairy roots can also be 
induced to regenerate into whole plants (Tepfer (1984) Cell 37: 959-967). There are no size 
requirements imposed on the T-DNA, which allows one to insert any gene of interest and 
have it transferred to the plant cells. This system allows one to rapidly produce hundreds or 
thousands of transgenic roots that express genes that have been created via in vitro shuffling. 
Root tissue is particularly useful for screening nematode resistance and rootworm resistance. 

A schematic diagram of this screening process is shown in Figure 4. A library 
of shuffled toxin genes is created as described above and ligated into a plasmid that contains 
an antibiotic resistance gene (e.g., for kanamycin), an E. coli origin of replication (for 
maintenance), and a region of the Ri T-DNA (Tepfer and Casse-Delbart (1987) Microbiol. 
Sci. 4: 24-28). The plasmid library can be introduced into A. rhizogenes cells by 
electroporation (Main et al. (1995) Methods Mol. Biol. 44: 405-412) and the cells are plated 
on a suitable medium (e.g., MYA medium (8.0 g/L mannitol, 5.0 g/L yeast extract, 2.0 g/L 
ammonium sulphate, 0.5 g/L casamino acids and 5.0 g/L sodium chloride, pH 6.6) 
containing 25-200 µg/ml kanamycin or other selection reagent) and incubated at 
approximately 28°C for several days. Only cells in which the plasmid has integrated into the 
endogenous Ri plasmid by homologous recombination in the T-DNA region survive 
selection, because the plasmid cannot freely replicate in A. rhizogenes. All of the colonies 
are washed from the plates and pooled for use as inocula on the plant tissues. 

Plant tissues are then inoculated with the colonies. Many different dicot and 
monocot species, including soybean (Glycine max), can be induced to form hairy roots by A. 
rhizogenes (De Cleene and De Ley (1981) Bot. Rev. 47: 147-94). The plant tissues (e.g., 
seedlings) are typically surface-sterilized, after which hypocotyl segments are cut and 
inserted apical end down in solid MS medium in 24- or 48-well plates. A drop of the A. 
rhizogenes inoculum is applied to the end of the tissue section and the plates are incubated at 
26-28°C in the dark until roots appear (1-4 weeks). Untransformed plant cells will not 
produce roots on MS medium. Thus, roots that form are assumed to be transformed and need 
not be subjected to antibiotic selection. Preferably, however, the A. rhizogenes is killed by 
removing the roots from the petioles and culturing them on MS medium supplemented with 
500 µg/ml carbenicillin or cefotaxime. Cultured hairy roots grow rapidly and can be 
subdivided several times to provide replicates for screening experiments. 

Independently transformed root lines are infected with nematodes and 
assayed for cysts and nematode death, or they are provided to second or third instar larvae 
and screened for insecticidal activity (larval death). The best root lines that survive nematode 
or insect attack are chosen and the toxin genes are reisolated, e.g., by PCR with primers 
matching the plasmid sequence surrounding the cloning site at which the shuffled genes 
were inserted. In preferred embodiments, these genes are mixed, DNase treated, reassembled 
and shuffled. A second round of introduction into A. rhizogenes and infection of plant tissue 
is carried out. These cycles can be repeated until the desired level of pest resistance is 
acquired. The final evolved toxin gene is isolated and used to transform the desired plant 
cultivar in a manner conducive to regenerating fertile commercially viable plants. 

The system is also useful for identifying genes that encode previously 
unknown toxins, or toxins for which the genes were not previously available. When the goal 
of the first round of screening is to identify a previously unknown toxin gene, a genomic 
library from the source organism can be made in the Ri plasmid. To facilitate cloning, 
linkers that contain an infrequently cleaved restriction site (e.g., NotI) are added to genomic 
fragments and cloned into the E. coli vector for delivery into A. rhizogenes. The remainder 
of the assay is as described except that the initial recovery of genes from surviving roots is 
followed by gene characterization and shuffling of all or part of the genomic sequences. 

Insect pathogens from which it is desirable to obtain toxin genes include, for 
example, microbes such as Bacillus thuringiensis* (various insects), Bacillus sphaericus* 
(mosquito), Bacillus popilliae* (beetle), Bacillus lentimorbus (beetle), Bacillus larvae (bee), 
Bacillus moritai (house fly), Clostridium brevifaciens* (caterpillar), Clostridium 
malacosomae* (caterpillar), Pseudomonas aeruginosa (various insects incl. grasshopper), 
Enterobacter cloacae (locust), Enterobacter aerogenes, Serratia marcescens (various 
insects), Serratia entomophila (beetle), Serratia liquefaciens (various insects), Proteus 
vulgaris (grasshopper), Xenorhabdus nematophilus* (beetle), Streptococcus faecalis (various 
insects), Rickettsiella popilliae* (beetle), Rickettsiella melolonthae* (beetles, caterpillar), 
and Mycoplasma/Spiroplasma (* indicates pathogens presently known to produce 
insecticidal proteins; others may produce the toxin). 

Viral pathogens include, for example, Baculovirus (including Nuclear 
polyhedrosis virus, Granulosis virus, and Nonoccluded virus), Polydnavirus (including 
Ichnovirus and Bracovirus), Poxvirus, Ascovirus, Iridovirus, Nodavirus, PicoRNAvirus, 
Tetravirus, Reovirus (including Cytoplasmic polyhedrosis virus), 
Birnavirus, Rhabdovirus, Togavirus, Flavivirus, and Bunyavirus. 

Fungal insect pathogens include, for example, Cordyceps spp., Strongwellsea 
spp., Zoophthora amuemis, Beauveria bassiana, Beauveria brongniartii, Paecilomyces 
fumosoroseus, Verticillium lecanii, Metarhizium flavoviride, Metarhizium anisopliae, 
Lagenidium giganteum, Nomuraea rileyi, Nomuraea cylindrospora, Pandora neoaphidis, 
Pandora delphacis, Neozygites floridana, Hirsutella thompsonii, Nilaparvata lugens, Erynia 
neoaphidis, and Massospora spp. 

Nematodes that are pathogenic to insects include, for example, Tetradonema 
plicans (fly), Mermis nigrescens (grasshopper), Romanomermis culicivorax (mosquito), 
Agamermis decaudata (grasshopper), Rhabditis insectivora (beetle), Steinernema spp. 
(beetle, caterpillar) (symbiotic bacteria of these nematodes (e.g., Xenorhabdus/Photorhabdus 
spp.) produce toxins), Steinernema carpocapsae, Steinernema glaseri, Steinernema kushidai, 
Eudiplogaster aphodii (beetle), Deladenus siricidicola (wasp), Contortylenchus spp. (beetle), 
Heterotylenchus autumnalis (fly), and Sphaerularia 

Other applications of root lines transformed with shuffled libraries include 
uptake and utilization of solutes, nutrients, or chemicals (Tepfer et al. (1989) Plant Mol. 
Biol. 13: 295-302). Also, fungal infections, Rhizobium nodule formation, and secondary 
metabolite formation can be screened using hairy roots (Tepfer and Casse-Delbart (1987) 
Microbiol. Sci. 4: 24-28; Saito et al. (1992) J. Nat. Prod.). 

There are several possible variations in the transgenic plant tissue screening 
method described here. First, A. tumefaciens, which is more widely used than A. rhizogenes, 
can deliver the shuffled gene library. Disarmed, binary versions of both strains 
(Walkerpeach and Velten (1994) Plant Molecular Biology Manual, B1: 1-19) allow genes to 
be transferred with antibiotic markers in the absence of native T-DNA disease-causing genes 
to select for transformed cells that can be induced to form callus, roots, shoots or whole 
plants depending on the tissue type that the pest in question will attack. For cereal and grain 
producing plant species, other plant transformation methods such as particle gun 
bombardment (Barcelo and Lazzeri (1995) Methods Mol. Biol. 49: 113-123) can be used to 
create transgenic tissues for screening. 

B. Increased Target Range 

The invention also provides methods of using DNA shuffling to obtain pest 
resistance genes that are effective against a broader range of insects, nematodes, or other 
pests than a naturally occurring gene. For example, one can apply DNA shuffling to families 
of genes that code for toxins having different target specificities and screen for those that 
exhibit toxicity against a desired target pest against which a toxin encoded by a naturally 
occurring gene was less effective. Specific examples of genes that one can shuffle to obtain 
enhanced target range include, but are not limited to: 

(i) Bt toxin genes can be shuffled to obtain higher activity vs. corn root worm 
and other coleopteran pests. 

(ii) Bt toxin genes can also be shuffled to enhance activity vs. other specific 
pests belonging to different orders such as Lepidoptera and Diptera. 

(iii) Bt toxin family of genes can also be shuffled to obtain new activity vs. 
insect pests that have developed resistance (Nature Biotech., Sep. 1997, p. 816) to existing 
toxins. 

(iv) Other genes coding for toxins such as cholesterol oxidase, protease 
inhibitors, lectins, etc. (Asgrow Reports - Genetic Engineering for Pest Control - Len 
Copping, Chapters 2.1-2.4), can be shuffled to enhance the potency as well as spectrum. 

Screening to identify members of libraries of shuffled genes that encode 
toxins having increased target range can use both in vivo and in vitro assay formats as 
described above. Again, in vitro assays are generally preferred because of their greater 
amenability to high throughput screening. Assays for insecticidal spectrum can use larval 
insect midgut (see, e.g., Van Rie et al. (1989) Eur. J. Biochem. 186: 239-47). Receptors for 
the toxins, either expressed in cell lines, or as BBMV, can be used as described above. 

Generally, cells or receptors that are not susceptible to, or do not strongly 
bind, a naturally occurring toxin of interest, are chosen for use in the assays. The library of 
recombinant toxins is tested to identify those that are active against the target cells, and/or 
that exhibit a high affinity for the target receptor. 

C. Decreased Susceptibility to Development of Resistance by Pests 

One problem that is often observed when using biopesticides is the target 
pest's development of resistance to the pesticide due to selective pressure on the pest 
populations (see, e.g., Kumar et al., supra.). The present invention provides methods of 
obtaining recombinant pest resistance genes that are less susceptible than naturally occurring 
genes to the development of resistance. 

Selection for optimized recombinant insect resistance genes that are less 
susceptible to the pest becoming resistant can involve, for example, feeding diverse toxins 
(e.g., those encoded by members of a library of shuffled genes) to a breeding population of 
insects and determining for each clone how quickly resistance occurs. An alternative approach 
is to use two or more Bt toxins, preferably diverse Bt toxins, so that resistance to both would 
be difficult to obtain. Different combinations of genes can be assayed as described above to 
determine the ease of development of resistance to both genes. 

One example of a scheme for obtaining a Bt toxin that is less susceptible to 
the development of resistance is as follows. Diamondback moths easily develop resistance 
against Cry1A, a potent and widely used Bt toxin. These resistant moths are still sensitive to 
Cry1C because Cry1C binds to a receptor different from that of Cry1A, but Cry1C is 
much less potent than Cry1A. One can use DNA shuffling as described herein to increase the 
potency of Cry1C so that it is more effective against the resistant insects. These screening 
tests can be done in Spodoptera frugiperda Sf9 insect cells, since Sf9 cells are sensitive to 
Cry1C but not to Cry1A. The assays can be performed either on unmodified Sf9 cells or on 
other insect cell lines (such as Heliothis sp., Trichoplusia ni or Diabrotica sp. (corn 
rootworm)) which are transfected with the gene for the Cry1C receptor (see, e.g., de Maagd 
et al. (1996) Appl. Environ. Microbiol. 62: 2753-7). 

D. Increased Expression Level 

In another embodiment, the invention provides methods of increasing the 
expression levels of pest resistance genes. This can be accomplished through optimization of 
the genes themselves, for example, by altering the GC content of the genes to more closely 
match that of plants, or improving codon usage through use of the DNA shuffling methods 
of the invention. 

Alternatively, increased expression can be achieved by using DNA shuffling 
to obtain improved promoters and other gene expression control signals. Usually, a pest 
resistance gene is operably linked to an additional sequence, such as a regulatory sequence, 
to ensure its expression. These regulatory sequences can include one or more of the 
following: an enhancer, a promoter, a signal peptide sequence, an intron and/or a 
polyadenylation sequence. The efficacy of a pest resistance gene often depends on the level 
of expression of the gene product by the plant or other host. An optimized promoter and/or 
other control sequence is likely to result in improved pest resistance. Moreover, it is 
sometimes desirable to have control over the type of cell in which a gene is expressed, 
and/or the timing of pest resistance gene expression. For example, the development of 
resistance to a pest resistance gene can be delayed or eliminated by using a promoter that is 
inducible or otherwise capable of directing expression of the resistance gene 
noncontinuously. The methods of the invention provide for optimization of these and other 
factors which are influenced by promoters and other control sequences. 

Expression can effectively be improved by a variety of means, including 
increasing the rate of production of an expression product, decreasing the rate of degradation 
of the expression product, or improving the capacity of the expression product to perform its 
intended function. The methods involve subjecting to DNA shuffling polynucleotides which 
are involved in control of gene expression. At least first and second forms of a nucleic acid 
that comprises a control sequence, which forms differ from each other in two or more 
nucleotides, are recombined as described above. The resulting library of recombinant control 
sequences is screened to identify at least one optimized recombinant control sequence that 
exhibits enhanced strength, inducibility, or specificity. 
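
The requirement that the first and second forms differ in two or more nucleotides can be checked with a short sketch. It assumes, for simplicity, that the two forms are already aligned and of equal length; the function names and the example sequences are hypothetical and do not correspond to any particular promoter.

```python
def count_differences(first: str, second: str) -> int:
    """Number of positions at which two aligned, equal-length sequences differ."""
    if len(first) != len(second):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(first.upper(), second.upper()))


def suitable_substrates(first: str, second: str) -> bool:
    """True when the two forms differ in two or more nucleotides."""
    return count_differences(first, second) >= 2


# Hypothetical control-sequence fragments.
form_1 = "TTGACATATAAT"
form_2 = "TTGACATATGAT"  # one difference from form_1
form_3 = "TTTACATATGAT"  # two differences from form_1

one_difference = suitable_substrates(form_1, form_2)
two_differences = suitable_substrates(form_1, form_3)
```

Forms that differ at only one position carry no pair of variations to be recombined with each other, which is consistent with the two-nucleotide minimum stated above.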

The substrates for recombination can be the full-length vectors, or fragments 
■ thereof, which include a coding sequence and/or regulatory sequences to which the coding 
20 sequence is operably linked. The substrates can include variants of any of the regulatory 
and/or coding sequence(s) present in the vector. If recombination is effected at the level of 
fragments, the recombinant segments should be reinserted into vectors before screening. If 
recombination proceeds in vitro, vectors containing the recombinant segments are usually 
introduced into cells before screening. 
Cells containing the recombinant segments can be screened by detecting
expression of the gene encoded by a selection marker. For purposes of selection and/or
screening, a gene product expressed from a vector is sometimes an easily detected marker
rather than a product having an actual therapeutic purpose, e.g., a green fluorescent protein
(see, Crameri (1996) Nature Biotechnol. 14: 315-319) or a cell surface protein. For example,
if this marker is green fluorescent protein, cells with the highest expression levels can be
identified by flow cytometry-based cell sorting. If the marker is a cell surface protein, the
cells are stained with a reagent having affinity for the protein, such as an antibody, and again
analyzed by flow cytometry-based cell sorting. Drug resistance genes can also provide a
selectable marker. Alternatively, the gene product can be a fusion protein comprising any
combination of detection and selection markers. Internal reference marker genes can be
included on the vector to detect and compensate for variations in copy number or insertion
site.
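
As a rough illustration of how an internal reference marker might be used to compensate for copy number or insertion site effects, the following Python sketch divides each clone's reporter signal by its reference signal and keeps the top-ranked fraction. The function name, the data layout, and the default 10% gate are assumptions made for illustration only; they are not taken from the described method.

```python
from typing import Dict, List, Tuple


def select_top_expressers(
    signals: Dict[str, Tuple[float, float]],
    top_fraction: float = 0.10,
) -> List[str]:
    """Rank clones by reporter signal normalized to an internal reference.

    signals maps a clone identifier to (reporter, reference) fluorescence.
    Clones whose reference signal is not positive are skipped, because the
    ratio would be undefined.
    """
    ratios = {
        clone: reporter / reference
        for clone, (reporter, reference) in signals.items()
        if reference > 0
    }
    # Highest normalized expression first.
    ranked = sorted(ratios, key=ratios.get, reverse=True)
    # Keep at least one clone whenever any clone could be ranked.
    keep = max(1, round(len(ranked) * top_fraction)) if ranked else 0
    return ranked[:keep]
```

Clones returned by such a ranking would then supply the recombinant segments for a further round of recombination and screening.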

Recombinant segments from the cells showing highest expression of the 
marker gene can be used as some or all of the substrates in a further round of recombination
and screening, if additional improvement is desired. The optimized control regions can then
be used for the expression of pest resistance genes in transgenic plants, or in microorganisms
that are applied to plants of interest, including microorganisms that can colonize the plants.

E. Increased Resistance to Protease Degradation 

Insect midgut fluids contain proteases, so resistance to protease degradation is 
a desirable property of pest resistance gene products. The present invention provides 
methods of obtaining recombinant pest resistance genes that encode polypeptides that exhibit
increased resistance to proteases. Typically, a library of recombinant genes is screened by
expressing the gene products, and testing to identify those that retain their integrity and
pesticidal activity when placed in the presence of a protease. For example, pools of shuffled
genes can be expressed, and the gene products incubated in the presence of insect midgut
fluids or other media that contain relevant proteases. The integrity of the polypeptides can be
determined by, for example, gel electrophoresis or by an appropriate bioassay. Those pools
that contain protease-resistant gene products can be sub-divided and retested to identify
those library members that encode protease-resistant gene products.
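
The sub-divide-and-retest step can be sketched as a recursive halving of positive pools. This is a minimal Python sketch under stated assumptions: the `pool_is_resistant` callback is a hypothetical stand-in for the gel or bioassay readout, and a pool is assumed to score positive whenever at least one of its members is protease resistant.

```python
from typing import Callable, List, Sequence


def deconvolute(
    pool: Sequence[str],
    pool_is_resistant: Callable[[Sequence[str]], bool],
) -> List[str]:
    """Return the members of `pool` that score positive when tested alone."""
    # A negative pool is discarded without testing its members individually.
    if not pool or not pool_is_resistant(pool):
        return []
    if len(pool) == 1:
        return list(pool)
    # Split a positive pool in two and retest each half.
    mid = len(pool) // 2
    return (deconvolute(pool[:mid], pool_is_resistant)
            + deconvolute(pool[mid:], pool_is_resistant))
```

Halving is only one possible sub-division scheme; it keeps the number of assays small when few library members are resistant.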

F. Increased Stability in Environmental Conditions 

Another property for which improvement is desirable is the ability of pest 
resistance gene products to withstand extremes of pH and other conditions that are prevalent
at the sites of action in target pests. Midgut fluids of Coleoptera and Hemiptera, for example,
are often at a relatively low pH (about pH 3-6), while those of most other insect guts are at a
relatively high pH (about pH 8-11). Inactivation by exposure to ultraviolet light is a major
problem that can limit the use of insect-pathogenic virus formulations, for example, as
sprayable insecticides. After the insecticides are sprayed onto crops to protect them from
insect damage, the virus is quickly inactivated by sunlight, particularly UV light.

Screening for these optimized shuffled genes can be performed in a similar
manner to testing for protease resistance as described above. For example, pest resistance
gene products are placed under conditions that are found at the site of action. Those library
members that encode gene products having increased stability under the test conditions are
identified.

To enhance the probability of obtaining genes that encode polypeptides
having reduced UV light sensitivity, one can include in the shuffling reaction
oligonucleotides that include codons for amino acids that are not highly sensitive to UV
light. One suitable method to screen for UV-resistant pathogenic virus formulations is as
follows. In the case of Autographa californica nuclear polyhedrosis virus (AcNPV), the
entire viral genome is shuffled. First, AcNPV is exposed to a dose of UV that is set at a
level allowing only 5% of the virus to survive. The viruses that survive the UV treatment are
plaque purified on Sf9 (Spodoptera frugiperda) cells, propagated in Trichoplusia ni (cabbage
looper), and subjected to a second treatment. This process is repeated several times.
The viral genome is isolated from the surviving population pool after several passages under
UV. The UV-resistant viral DNA is mixed with DNA from wild-type virus and shuffled.
Sf9 cells are transfected with the reassembled DNA, and virus is isolated. After this UV
selection cycle, several virus clones that are UV resistant and show no other obvious
changes in phenotype, such as infectivity and speed of kill, are obtained.
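
To see why repeating the UV treatment enriches resistant virus, a simple two-population model can be sketched in Python. The model and all of its numbers are assumptions for illustration: it supposes that resistant and sensitive viruses each survive a treatment with a fixed probability and that propagation between treatments does not change their proportions.

```python
from typing import List


def enrich(fraction_resistant: float, survival_resistant: float,
           survival_sensitive: float, rounds: int) -> List[float]:
    """Track the resistant fraction of a virus pool over selection rounds."""
    history = [fraction_resistant]
    f = fraction_resistant
    for _ in range(rounds):
        survivors_r = f * survival_resistant          # resistant survivors
        survivors_s = (1.0 - f) * survival_sensitive  # sensitive survivors
        total = survivors_r + survivors_s
        if total == 0:
            raise ValueError("no virus survives the treatment")
        f = survivors_r / total                       # new resistant fraction
        history.append(f)
    return history
```

For example, `enrich(0.001, 0.5, 0.05, 3)` follows a pool that starts with one resistant virus per thousand through three treatments; the hypothetical survival values 0.5 and 0.05 only loosely echo the 5% survival level mentioned above.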

G. Reduced Toxicity to a Host Plant

Shuffled genes that are prepared using the DNA shuffling methods of the 
invention can also be screened to identify those that exhibit reduced toxicity to a host plant 
compared to a naturally occurring gene. The genes can be introduced into plants or plant 
cells to identify those that are relatively nontoxic, or the gene products can be assayed for 
toxicity against plants or plant cells. 

V. USES OF OPTIMIZED PEST RESISTANCE GENES

The optimized pest resistance genes produced using the methods of the 
invention find uses both in vitro and in vivo. For example, the genes having improved anti- 
pest activities can be used in vitro to study the mechanisms by which plants can be protected 
against pests, and for production of pesticides that can be applied to plants. The optimized 
pest resistance genes can be introduced into microorganisms that colonize plant surfaces, or
can be introduced into plants themselves. In each case, expression of the pest resistance gene
is capable of conferring upon the plant resistance to the pest.

A. Production of Pesticides

The optimized pest resistance genes can be used for the recombinant
production of polypeptides that are useful as pesticides. Typically, an optimized gene is
introduced into an expression cassette for high level expression in a desired host cell. A
typical expression cassette contains a promoter operably linked to the desired DNA
sequence. More than one optimized pest resistance gene can be expressed in a single
prokaryotic cell by placing multiple transcriptional cassettes in a single expression vector, by
constructing a gene that encodes a fusion protein consisting of more than one pest resistance
gene, or by utilizing different selectable markers for each of the expression vectors which are
employed in the cloning strategy.

Optimized pest resistance genes of the invention can be expressed in a variety
of host cells, including E. coli, other bacterial hosts, yeast, and various higher eukaryotic
cells such as the COS, CHO and HeLa cell lines and myeloma cell lines. Examples of
useful bacteria include, but are not limited to, Escherichia, Enterobacter, Azotobacter,
Erwinia, Bacillus, Pseudomonas, Klebsiella, Proteus, Salmonella, Serratia, Shigella,
Rhizobia, Vitreoscilla, and Paracoccus. The recombinant gene will be operably linked to
appropriate expression control sequences for each host. For E. coli, this includes a promoter
such as the T7, trp, or lambda promoters, a ribosome binding site and preferably a
transcription termination signal. For eukaryotic cells, the control sequences will include a
promoter and preferably an enhancer derived from immunoglobulin genes, SV40,
cytomegalovirus, etc., and a polyadenylation sequence, and may include splice donor and
acceptor sequences.

In a preferred embodiment, the expression cassettes are useful for expression
of pest resistance genes in prokaryotic host cells. Commonly used prokaryotic control
sequences, which are defined herein to include promoters for transcription initiation,
optionally with an operator, along with ribosome binding site sequences, include such
commonly used promoters as the beta-lactamase (penicillinase) and lactose (lac) promoter
systems (Chang et al., Nature (1977) 198: 1056), the tryptophan (trp) promoter system
(Goeddel et al., Nucleic Acids Res. (1980) 8: 4057), the tac promoter (DeBoer et al., Proc.
Natl. Acad. Sci. U.S.A. (1983) 80: 21-25), and the lambda-derived PL promoter and N-gene
ribosome binding site (Shimatake et al., Nature (1981) 292: 128). The particular promoter
system is not critical to the invention; any available promoter that functions in prokaryotes
can be used.

Either constitutive or regulated promoters can be used in the present
invention. Regulated promoters can be advantageous because the host cells can be grown to
high densities before expression of the pest resistance polypeptides is induced. High level
expression of heterologous proteins slows cell growth in some situations. Regulated
promoters especially suitable for use in E. coli include the bacteriophage lambda PL
promoter, the hybrid trp-lac promoter (Amann et al., Gene (1983) 25: 167; de Boer et al.,
Proc. Natl. Acad. Sci. USA (1983) 80: 21), and the bacteriophage T7 promoter (Studier et al.,
J. Mol. Biol. (1986); Tabor et al., (1985)). These promoters and their use are discussed in
Sambrook et al., supra.

For expression of pest resistance polypeptides in prokaryotic cells other than
E. coli, a promoter that functions in the particular prokaryotic species is required. Such
promoters can be obtained from genes that have been cloned from the species, or
heterologous promoters can be used. For example, the hybrid trp-lac promoter functions in
Bacillus in addition to E. coli. Promoters suitable for use in eukaryotic host cells are well
known to those of skill in the art.

A ribosome binding site (RBS) is conveniently included in the expression 
cassettes of the invention that are intended for use in prokaryotic host cells. An RBS in E. 
coli, for example, consists of a nucleotide sequence 3-9 nucleotides in length located 3-11
nucleotides upstream of the initiation codon (Shine and Dalgarno, Nature (1975) 254: 34;
Steitz, In Biological regulation and development: Gene expression (ed. R.F. Goldberger),
vol. 1, p. 349, 1979, Plenum Publishing, NY).
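
The length and spacing constraints on an E. coli RBS can be checked with a short Python sketch. Several choices here are assumptions for illustration: the consensus string (taken as the DNA-sense complement of the 3' end of 16S rRNA), the function name, the requirement of an exact match to part of that consensus, and the definition of the gap as the number of nucleotides between the end of the motif and the first base of the initiation codon.

```python
from typing import Optional, Tuple

# Assumed consensus: DNA-sense complement of the 3' end of E. coli 16S rRNA.
SD_CONSENSUS = "TAAGGAGGT"


def find_rbs(seq: str, start_codon_index: int,
             min_len: int = 3, max_len: int = 9,
             min_gap: int = 3, max_gap: int = 11
             ) -> Optional[Tuple[int, str, int]]:
    """Return (position, motif, gap) for the longest consensus match, or None.

    gap counts the nucleotides between the end of the motif and the first
    base of the initiation codon.
    """
    seq = seq.upper()
    for length in range(max_len, min_len - 1, -1):  # prefer longer matches
        for gap in range(min_gap, max_gap + 1):
            end = start_codon_index - gap            # exclusive end of motif
            begin = end - length
            if begin < 0:
                continue
            candidate = seq[begin:end]
            # Accept any exact stretch of the assumed consensus.
            if candidate in SD_CONSENSUS:
                return begin, candidate, gap
    return None
```

A call such as `find_rbs(cassette_sequence, atg_position)` returns `None` when no stretch of the assumed consensus lies within the 3-11 nucleotide window, which could flag a cassette whose RBS placement deserves a second look.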

Translational coupling can be used to enhance expression. The strategy uses
a short upstream open reading frame derived from a highly expressed gene native to the
translational system, which is placed downstream of the promoter, and a ribosome binding
site followed after a few amino acid codons by a termination codon. Just prior to the
termination codon is a second ribosome binding site, and following the termination codon is
a start codon for the initiation of translation. The system dissolves secondary structure in the
RNA, allowing for the efficient initiation of translation. See, Squires et al. (1988) J. Biol.
Chem. 263: 16297-16302.

The pest resistance polypeptides can be expressed intracellularly, or can be
secreted from the cell. Intracellular expression often results in high yields. If necessary, the
amount of soluble, active pest resistance polypeptide may be increased by performing
refolding procedures (see, e.g., Sambrook et al., supra.; Marston et al., Bio/Technology
(1984) 2: 800; Schoner et al., Bio/Technology (1985) 3: 151). In embodiments in which the
pest resistance polypeptides are secreted from the cell, either into the periplasm or into the
extracellular medium, the DNA sequence is linked to a cleavable signal peptide sequence.
The signal sequence directs translocation of the pest resistance polypeptide through the cell
membrane. An example of a suitable vector for use in E. coli that contains a promoter-signal
sequence unit is pTA1529, which has the E. coli phoA promoter and signal sequence (see,
e.g., Sambrook et al., supra.; Oka et al., Proc. Natl. Acad. Sci. USA (1985) 82: 7212;
Talmadge et al., Proc. Natl. Acad. Sci. USA (1980); Takahara et al., J. Biol. Chem.
(1985) 260: 2670).

The pest resistance polypeptides of the invention can also be produced as
fusion proteins. This approach often results in high yields, because normal prokaryotic
control sequences direct transcription and translation. In E. coli, lacZ fusions are often used
to express heterologous proteins. Suitable vectors are readily available, such as the pUR,
pEX, and pMR100 series (see, e.g., Sambrook et al., supra.). For certain applications, it
may be desirable to cleave the non-pest resistance polypeptide amino acids from the fusion
protein after purification. This can be accomplished by any of several methods known in the
art, including cleavage by cyanogen bromide, a protease, or by Factor Xa (see, e.g.,
Sambrook et al., supra.; Itakura et al., Science (1977) 198: 1056; Goeddel et al., Proc.
Natl. Acad. Sci. USA (1979) 76: 106; Nagai et al., Nature (1984) 309: 810; Sung et al.,
Proc. Natl. Acad. Sci. USA (1986) 83: 561). Cleavage sites can be engineered into the gene
for the fusion protein at the desired point of cleavage.

A suitable system for obtaining recombinant proteins from E. coli which
maintains the integrity of their N-termini has been described by Miller et al. (1989)
Biotechnology 7: 698-704. In this system, the gene of interest is produced as a C-terminal
fusion to the first 76 residues of the yeast ubiquitin gene containing a peptidase cleavage
site. Cleavage at the junction of the two moieties results in production of a protein having an
intact authentic N-terminal residue.

The expression vectors of the invention can be transferred into the chosen
host cell by well-known methods such as calcium chloride transformation for E. coli and
calcium phosphate treatment or electroporation for mammalian cells. Cells transformed by
the plasmids can be selected by resistance to antibiotics conferred by genes contained on the
plasmids, such as the amp, gpt, neo and hyg genes.

Once expressed, the recombinant pest resistance polypeptides can be purified
according to standard procedures of the art, including ammonium sulfate precipitation,
affinity columns, column chromatography, gel electrophoresis and the like (see, generally,
R. Scopes, Protein Purification, Springer-Verlag, N.Y. (1982); Deutscher, Methods in
Enzymology Vol. 182: Guide to Protein Purification, Academic Press, Inc. N.Y. (1990)).
Substantially pure compositions of at least about 90 to 95% homogeneity are preferred, and
98 to 99% or more homogeneity are most preferred. Once purified, partially or to
homogeneity as desired, the polypeptides may then be used (e.g., as immunogens for
antibody production).

One of skill would recognize that modifications can be made to the pest
resistance polypeptides without diminishing their biological activity. Some modifications
may be made to facilitate the cloning, expression, or incorporation of the targeting molecule
into a fusion protein. Such modifications are well known to those of skill in the art and
include, for example, a methionine added at the amino terminus to provide an initiation site,
or additional amino acids (e.g., poly His) placed on either terminus to create conveniently
located restriction sites or termination codons or purification sequences.

The polypeptides encoded by the optimized pest resistance genes can be
formulated for application to plants as is known to those of skill in the art. For Bt toxins, for
example, one or more forms of the toxin (e.g., crystals, crystal proteins, protoxin, toxin, and
insecticidally effective portions of the toxins) can be formulated for application to plants, or
for assays of insecticidal activity. The active pest resistance polypeptide can be formulated
with suitable carriers, diluents, emulsifiers and/or dispersants. This insecticide composition
can be formulated in any of multiple forms, such as a wettable powder, pellets, granules or a
dust, or as a liquid formulation with aqueous or non-aqueous solvents as a foam, gel,
suspension, concentrate, etc. The concentration of the active ingredient in such a
composition will depend upon the nature of the formulation and its intended mode of use.
For extended protection (e.g., for a whole growing season), additional amounts of the
composition can be applied periodically.

The pesticidal polypeptides can be formulated in a dry, solid unit dosage
form, such as capsules, boluses or tablets that contain the desired amount of active
compound. These dosage forms are prepared by mixing the active ingredient with suitable
diluents, fillers, disintegrating agents and/or binders such as starch, lactose, talc, magnesium
stearate, vegetable gums and the like. Such unit dosage formulations may be varied widely
with respect to their total weight and content of the pesticidal agent, depending upon
factors such as the type of plant to be treated and the severity and type of infestation.

B. Treatment of Plants with Microorganisms that Express Optimized Pest Resistance Genes

The optimized insect resistance genes, or insecticidally effective portions
thereof, can be introduced into microorganisms that can colonize plants. Ingestion by a pest
of a plant upon which the microorganisms are present results in the gene product of the pest
resistance gene causing the death of the pest. Microbes capable of colonizing plant
phytospheres are described in, for example, US Patent No. 5,281,532 and European Patent
Application 0 200 344. Methods of introducing and expressing genes in microorganisms
are described herein and are otherwise well known to those skilled in the art (see, e.g., U.S.
Pat. No. 5,135,867).

Microorganism hosts are selected which are known to occupy the
"phytosphere" of one or more crops of interest. These microorganisms are selected so as to
be capable of successfully competing in the particular environment (crop and other insect
habitats) with the wild-type microorganisms, provide for stable maintenance and expression
of the gene expressing the polypeptide pesticide, and, desirably, provide for improved
protection of the polypeptide from environmental degradation and inactivation. Host
microorganisms of particular interest include prokaryotes and the lower eukaryotes, such as
fungi. Illustrative prokaryotes, both Gram-negative and -positive, include
Enterobacteriaceae, such as Escherichia, Erwinia, Shigella, Salmonella, and Proteus;
Bacillaceae; Rhizobiaceae, such as Rhizobium; Spirillaceae (including Photobacterium),
Zymomonas, Serratia, Aeromonas, Vibrio, Desulfovibrio, Spirillum; Lactobacillaceae;
Pseudomonadaceae, such as Pseudomonas and Acetobacter; Azotobacteraceae and
Nitrobacteraceae. Among eukaryotes are fungi, such as Phycomycetes and Ascomycetes,
which includes yeast, such as Saccharomyces and Schizosaccharomyces; and Basidiomycetes
yeast, such as Rhodotorula, Aureobasidium, Sporobolomyces, and the like.

Application of microorganisms transformed with optimized pest resistance
genes to plants can be accomplished using methods known to those of skill in the art (see,
e.g., US Patent No. 5,281,532). Typically, the transformed microorganism is applied to its
natural habitat, such as the rhizosphere or phylloplane of the plant to be protected from the
pest. The microorganisms grow in their natural habitat, and produce the pesticidal agent
encoded by the pest resistance gene. The agent is absorbed and/or ingested by the larvae or
adult pest, or has a toxic effect on the ova. Long-term protection of the plants is provided
by the persistence of the microorganisms, but repetitive administrations may be required
from time to time. The recombinant organisms can be applied by spraying, soaking, injection
into the soil, seed coating, seedling coating or spraying, or the like. Where administered in
the field, generally concentrations of the organism will be from 10^[illegible] to 10^[illegible] cells/ml, and the
volume applied per hectare will be generally from about 0.1 oz to 2 lbs or more. Where
administered to a plant part, the concentration of the organism will usually be from 10^[illegible] to
10^[illegible] cells/cm^2.

C. Introduction of Insect Resistance Genes into Plant Cells

In another embodiment, the optimized recombinant pest resistance genes
produced as described herein are introduced into plant cells, including plant cells that are
present in an intact plant or plant part. Expression of the recombinant resistance gene then
confers resistance upon the plant or plant part.

The invention provides expression cassettes that are useful for expressing
optimized pest resistance genes in plants. In addition to the optimized pest resistance gene,
the expression cassettes include polynucleotide sequences that function to direct expression
of the gene. The expression cassettes typically include proper transcriptional initiation
regulatory regions, i.e., a promoter sequence, an intron, and a polyadenylation site region
recognized in the host plant of interest, all linked in a manner which permits the transcription
of the coding sequence and subsequent processing in the nucleus. These sequences can be
derived from any source, such as virus, plant or bacterial genes. One example of a preferred
source for transcription promoters and terminators is plant viruses such as, for example,
cauliflower mosaic virus (CaMV), which is described in Hohn et al. (1982) Curr. Topics
Microbiol. Immunol. 96: 194-220 and Appendices A to G. CaMV has at least two promoters
that are functional in plants, namely the 19S promoter, which results in transcription of gene
VI of CaMV, and the promoter of the 35S transcript. The CaMV 35S or 19S promoters may
be enhanced by the method described in Kay et al. (1987) Science 236: 1299-1302.

Promoters and other control sequences from plant genes are also suitable for
use in the expression of pest resistance genes prepared using the methods of the invention.
Examples include those from a gene that encodes the small subunit of ribulose bisphosphate
carboxylase, and from a gene that codes for chlorophyll a/b-binding protein. See, e.g.,
Morelli et al. (1985) Nature 315: 200-204. Other suitable promoters include the full-length
transcript promoter from Figwort mosaic virus, ubiquitin promoters, actin promoters, histone
promoters, tubulin promoters, or the mannopine synthase promoter (MAS). One can use a
promoter that causes preferential expression in a particular tissue, such as leaves, stems,
roots, or meristematic tissue, or the promoter may be inducible, such as by light, heat stress,
water stress or chemical application or production by the plant. Exemplary green
tissue-specific promoters include the maize phosphoenol pyruvate carboxylase (PEPC) promoter,
small subunit ribulose bisphosphate carboxylase promoters (ssRUBISCO) and the chlorophyll a/b
binding protein promoters. The promoter may also be a pith-specific promoter, such as the
promoter isolated from a plant TrpA gene as described in International Publication No.
WO 93/07278.

Bacterial genes that are expressed in plants are another source of suitable
control regions. These include those present in the T-DNA region of Agrobacterium
plasmids such as, for example, the Ti plasmid of A. tumefaciens and the Ri plasmid of A.
rhizogenes. Particularly preferred Agrobacterium promoters and 5' and 3' untranslated
regions for use in the expression of optimized pest resistance genes include, for example,
those of the genes that code for octopine synthase and nopaline synthase. See, e.g., Bevan et
al. (1983) Nature 304: 184-187.

A variety of techniques for introducing genes into plant cells and obtaining
expression of the genes are known in the art. Methods are known for introduction and
expression of heterologous genes in both monocot and dicot plants. In addition to Berger,
Ausubel and Sambrook, useful general references for plant cell cloning, culture and
regeneration include Payne et al. (1992) Plant Cell and Tissue Culture in Liquid Systems,
John Wiley & Sons, Inc. New York, NY (Payne); and Gamborg and Phillips (eds) (1995)
Plant Cell, Tissue and Organ Culture; Fundamental Methods, Springer Lab Manual,
Springer-Verlag (Berlin Heidelberg New York) (Gamborg). Cell culture media are
described in Atlas and Parks (eds) The Handbook of Microbiological Media (1993) CRC
Press, Boca Raton, FL (Atlas). Additional information is found in commercial literature
such as the Life Science Research Cell Culture catalogue (1998) from Sigma-Aldrich, Inc
(St Louis, MO) (Sigma-LSRCCC) and, e.g., the Plant Culture Catalogue and supplement
(1997) also from Sigma-Aldrich, Inc (St Louis, MO) (Sigma-PCCS). See also, U.S.
Patent Nos. 5,633,446, 5,317,096, 5,689,052, 5,159,135, and 5,679,558; Weising et al.
(1988) Ann. Rev. Genet. 22: 421-477. Examples of suitable methods include Agrobacterium
tumefaciens mediated transformation, direct gene transfer into protoplasts, microprojectile
bombardment, injection into protoplasts, cultured cells and tissues or meristematic tissues,
and electroporation. Microinjection techniques are known in the art and well described in the
scientific and patent literature. The introduction of DNA constructs using polyethylene
glycol precipitation is described in Paszkowski et al. (1984) EMBO J. 3: 2717-2722.
Electroporation techniques are described in Fromm et al. (1985) Proc. Natl. Acad. Sci. USA
82: 5824. Ballistic transformation techniques are described in Klein et al. (1987) Nature
327: 70-73; these methods involve penetration of cells by small particles with the nucleic
acid either within the matrix of small beads or particles, or on the surface. Although typically
only a single introduction of a new nucleic acid segment is required, this method particularly
provides for multiple introductions. Transformation of monocots is known using various
techniques including electroporation (e.g., Shimamoto et al. (1992) Nature 338: 274-276);
biolistics (e.g., European Patent Application 270,356); and Agrobacterium (e.g., Bytebier et
al. (1987) Proc. Natl. Acad. Sci. USA 84: 5345-5349).

Agrobacterium tumefaciens-mediated transformation techniques are well
described in the scientific literature. See, for example, Horsch et al. (1984) Science
233: 496-498, and Fraley et al. (1983) Proc. Natl. Acad. Sci. USA 80: 4803. In these methods, a plant
cell, an explant, a meristem or a seed is infected with Agrobacterium tumefaciens
transformed with the segment. Under appropriate conditions known in the art, the
transformed plant cells are grown to form shoots, roots, and develop further into plants. The
insect resistance gene can be introduced into appropriate plant cells, for example, by means
of the T-DNA-containing Ti plasmid of Agrobacterium tumefaciens. T-DNA of
Agrobacterium is commonly used as a vector for introducing heterologous DNA into plants.
Both binary and insertion vectors are known. See, e.g., European Patent 0 120 516; Hoekema
(1985) In: The binary plant vector system, Offset-drukkerij Kanters B.V., Alblasserdam,
Chapter 5; Fraley et al., Crit. Rev. Plant Sci. 4: 1-46; An et al. (1985) EMBO J. 4: 277-287.
The Ti plasmid is transmitted to plant cells upon infection by Agrobacterium tumefaciens,
and is stably integrated into the plant genome (Horsch et al. (1984) Science 233: 496-498;
Fraley et al. (1983) Proc. Natl. Acad. Sci. USA 80: 4803).

Typically, the vector used to introduce the insect resistance gene into a plant
will include a selection marker. Selection markers confer on the transformed plant cells
resistance to a biocide or an antibiotic, such as, for example, kanamycin, G 418, bleomycin,
hygromycin, or chloramphenicol, or herbicide resistance, such as resistance to chlorsulfuron
or Basta. Examples of suitable coding sequences for selectable markers are: the neo gene,
which codes for the enzyme neomycin phosphotransferase, which confers resistance to the
antibiotic kanamycin (Beck et al. (1982) Gene 19: 327); the hyg gene, which codes for the
enzyme hygromycin phosphotransferase and confers resistance to the antibiotic hygromycin
(Gritz and Davies (1983) Gene 25: 179); and the bar gene (EP 242236), which codes for
phosphinothricin acetyl transferase, which confers resistance to the herbicidal compounds
phosphinothricin and bialaphos.

Pathogens of the pest can also be used to introduce an optimized pest
resistance gene into the target pest. For example, foreign genes have been expressed in
baculovirus (a virus that infects insects) in order to improve the viral performance as a
sprayable insecticide. In one example, recombinant Bombyx mori (silkworm) nuclear
polyhedrosis virus (BmNPV) expressing an insect diuretic hormone gene effectively
disturbed the insect larval fluid metabolism, causing earlier death than the original BmNPV
(Maeda (1989) Biochem. Biophys. Res. Comm. 165: 1177-1183). A shuffled gene encoding
any protein that can cause the host insect to die can be inserted into the baculovirus. Any
pathogen of the target pest, not only viruses but also bacteria, fungi, nematodes, etc., can be
used to introduce the shuffled insecticidal protein genes into the pest to enhance their
pathogenicity.

As one example, shuffled Bt insecticidal protein genes are used. A membrane
spanning portion of Bt crystal protein called "Domain I" is cloned from several cryI-type
genes by PCR using proper sets of primers. These amplified genes are mixed and shuffled.
The shuffled genes are then cloned into baculovirus (AcNPV) expression vectors including
those containing an early stage promoter (e.g., p10, gp64) or a late stage promoter (e.g.,
polyhedrin) along with viral genome DNA. When the vector constructs are individually
used to cotransfect Sf9 cells with viral DNA, which is cut open at the vector integration site,
the recombinant viruses are obtained. The viruses propagated in Sf9 cells are tested in T. ni
for speed of kill. One set of clones, which contain the shuffled Bt cryI Domain I under an
early stage promoter, shows significant improvement in the kill speed.

Nematodes are also useful for delivery of an insecticidal protein. In particular,
Steinernema spp. are suitable for this application, because they contain gram-negative
symbiotic bacteria. In fact, these symbiotic bacteria produce their own set of insecticidal
proteins (Bowen et al. (1998) Science 280: 2129-2132). The insecticidal genes from
Photorhabdus luminescens can be shuffled to improve their specific activity and/or host
specificity. When a nematode carrying the symbiotic bacterium invades insect larvae, it
releases the bacterium into the insect body cavity. The bacterium then grows in the insect
and produces the insecticidal protein.

Plant cells transformed with the optimized pest resistance genes can be
regenerated to obtain intact plants that contain the transformed cells. See, e.g., European
patent publications 0 116 718 and 0 270 822, PCT publication WO 84/02913 and European
patent application 87/400,544.0. The plants can form germ cells and transmit the pest
resistance genes to progeny plants, which can be grown in a normal manner and crossed with
other plants. Such regeneration techniques generally rely on manipulation of certain
phytohormones in a tissue culture growth medium, typically relying on a biocide and/or
herbicide marker which has been introduced together with the shuffled nucleotide sequences.
Plant regeneration from cultured protoplasts is described in Evans et al., Protoplasts
Isolation and Culture, Handbook of Plant Cell Culture, pp. 124-176, MacMillan Publishing
Company, New York, 1983, and Binding, Regeneration of Plants, Plant Protoplasts, pp. 21-73,
CRC Press, Boca Raton, 1985. Regeneration can also be obtained from plant callus,
explants, organs, or parts thereof. Such regeneration techniques are described generally in
Klee et al. (1987) Ann. Rev. of Plant Phys. 38: 467-486. To obtain plants that are
homozygous for the improved gene, one can reproduce the plants and test those progeny that
are resistant to the particular pathogen.

The invention includes plants, plant parts, and plant cells that contain an
optimized pest resistance gene such as those prepared using the methods described herein.
Progeny and other descendants of such plants are also within the scope of the invention.

D. Introduction of Pest Resistance Genes into Insect Viruses

The optimized pest resistance genes obtained using the methods described
herein can also be introduced into viruses that infect pests. Introduction of a pest resistance
gene into a virus can enhance the pathogenicity of the virus. Viruses that infect insects
include, for example, baculoviruses and entomopoxviruses. Methods for inserting genes into
insect viruses are well known and readily practiced by those skilled in the art (see, e.g.,
Merryweather et al. (1990) J. Gen. Virol. 71: 1535-1544 and Martens et al. (1990) Appl.
Environmental Microbiol. 56: 2764-2770).

AUTOMATION FOR STRAIN IMPROVEMENT AND INTEGRATED SYSTEMS 

One aid to strain improvement is having an assay that can be dependably used
to identify a few mutants out of thousands that have potentially subtle increases in product
yield or insect resistance/toxicity activity. The limiting factor in many assay formats is the
uniformity of library cell (or viral) growth. This variation is the source of baseline
variability in subsequent assays. Inoculum size and culture environment
(temperature/humidity) are sources of cell growth variation. Automation of all aspects of
establishing initial cultures and state-of-the-art temperature and humidity controlled
incubators are useful in reducing variability.

In one aspect, library members, e.g., cells, viral plaques, spores or the like,
are separated on solid media to produce individual colonies (or plaques). Using an
automated colony picker (e.g., the Q-bot, Genetix, U.K.), colonies are identified, picked, and
10,000 different mutants are inoculated into 96 well microtiter dishes containing two 3 mm glass
balls/well. The Q-bot does not pick an entire colony but rather inserts a pin through the
center of the colony and exits with a small sampling of cells (or mycelia) and spores (or
viruses in plaque applications). The time the pin is in the colony, the number of dips to
inoculate the culture medium, and the time the pin is in that medium each affect inoculum
size, and each can be controlled and optimized. The uniform process of the Q-bot decreases
human handling error and increases the rate of establishing cultures (roughly 10,000/4
hours). These cultures are then shaken in a temperature and humidity controlled incubator.
The glass balls in the microtiter plates act to promote uniform aeration of cells and the
dispersal of mycelial fragments, similar to the blades of a fermenter.

A high throughput method for detecting analyte molecules from a complex
biological matrix is electrospray tandem mass spectrometry, as taught in "HIGH
THROUGHPUT MASS SPECTROMETRY" by Sun Ai Raillard, USSN 60/119,766, filed
02/11/1999. The '766 application describes methods which utilize off-line parallel sample
purification and fast flow-injection analysis, typically reducing the time of analysis to 30 to
40 seconds per sample.
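
The throughput figures quoted above (roughly 10,000 cultures picked in 4 hours into 96-well plates, and 30 to 40 seconds of analysis per sample) can be turned into plate counts and instrument time with simple arithmetic. The following is a minimal sketch; the function names and the `instruments` parameter are illustrative assumptions and do not come from the source.

```python
import math

def plates_needed(n_cultures, wells_per_plate=96):
    """Number of microtiter plates required to hold n_cultures, one culture per well."""
    return math.ceil(n_cultures / wells_per_plate)

def picks_per_hour(n_cultures, hours):
    """Average picking rate for a colony picker."""
    return n_cultures / hours

def ms_samples_per_hour(seconds_per_sample):
    """Samples analyzed per hour by one instrument at a fixed time per sample."""
    return 3600 / seconds_per_sample

def ms_hours(n_samples, seconds_per_sample, instruments=1):
    """Instrument time in hours; 'instruments' (parallel units) is an assumption."""
    return n_samples * seconds_per_sample / 3600 / instruments

if __name__ == "__main__":
    print(plates_needed(10_000))          # plates for 10,000 picked colonies
    print(picks_per_hour(10_000, 4))      # colonies picked per hour
    print(ms_samples_per_hour(30), ms_samples_per_hour(40))
    print(ms_hours(96, 40))               # hours to analyze one 96-well plate
```

Under these figures, a single 96-well plate takes roughly 50 to 65 minutes of analysis time on one instrument.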

Generally, all steps starting from cell picking, cell growth, sample preparation
and analysis are automated and can be carried out overnight by various robotic workstations.
A number of well known robotic systems have also been developed for solution phase
chemistries useful in assay systems. These systems include automated workstations like the
automated synthesis apparatus developed by Takeda Chemical Industries, LTD. (Osaka,
Japan) and many robotic systems utilizing robotic arms (Zymate II, Zymark Corporation,
Hopkinton, Mass.; Orca, Hewlett-Packard, Palo Alto, Calif.) which mimic the manual
synthetic operations performed by a scientist. Any of the above devices are suitable for use
with the present invention, e.g., for high-throughput screening of molecules assembled from
the various oligonucleotide sets described herein. The nature and implementation of
modifications to these devices (if any) so that they can operate as discussed herein with
reference to the integrated system will be apparent to persons skilled in the relevant art.

High throughput screening systems are commercially available (see, e.g.,
Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman
Instruments, Inc., Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems
typically automate entire procedures including all sample and reagent pipetting, liquid
dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate
for the assay. These configurable systems provide high throughput and rapid start up as well
as a high degree of flexibility and customization. The manufacturers of such systems
provide detailed protocols for the various high throughput assays. Thus, for example, Zymark Corp.
provides technical bulletins describing screening systems for detecting the modulation of
gene transcription, ligand binding, and the like. A variety of commercially available
peripheral equipment and software is available for digitizing, storing and analyzing data,
e.g., using PC (Intel x86 or Pentium chip-compatible DOS™, OS2™, WINDOWS™,
WINDOWS NT™ or WINDOWS95-98™ based machines), MACINTOSH™, LINUX, or
UNIX based (e.g., SUN™ work station) computers.

Integrated systems for assay analysis in the present invention typically
include a digital computer with, e.g., high-throughput liquid control software, data
digitization software, data interpretation software, a robotic liquid control armature for
transferring solutions from a source to a destination operably linked to the digital computer,
an input device (e.g., a computer keyboard) for entering data to the digital computer to
control high throughput liquid transfer by the robotic liquid control armature, an image
scanner for digitizing signals from assay components and the like.

Of course, these assay systems can also include integrated systems
incorporating nucleic acid selection elements for screening, such as a computer, database
with nucleic acid sequences of interest, sequence alignment software and the like. In
addition, this software can include components for ordering selected oligonucleotides (e.g.,
used in oligonucleotide mediated shuffling of insect resistance genes), and/or directing
synthesis of oligonucleotides or genes by an operably linked oligonucleotide synthesis
machine. Thus, the integrated system elements of the invention optionally include any of the
above components to facilitate high throughput recombination and selection. It will be
appreciated that these high-throughput recombination elements can be in systems separate
from those for performing selection assays, or the two can be integrated.

In the high throughput assays of the invention, it is possible to screen up to
several thousand different shuffled variants in a single day. In particular, each well of a
microtiter plate can be used to run a separate assay, or, if concentration or incubation time
effects are to be observed, every 5-10 wells can test a single variant. Thus, a single standard
microtiter plate can assay about 100 (e.g., 96) reactions. If 1536 well plates are used, then a
single plate can easily assay from about 100 to about 1500 different reactions. It is possible to
assay several different plates per day; assay screens for up to about 6,000-20,000 different
assays (i.e., involving different nucleic acids, encoded proteins, concentrations, etc.) are
possible using the integrated systems of the invention. More recently, microfluidic
approaches to reagent manipulation have been developed, e.g., by Caliper Technologies
(Mountain View, CA).
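
The plate arithmetic in the preceding paragraph can be sketched as follows. This is only an illustration of the numbers given in the text (96 or 1536 wells per plate, 1 or 5-10 wells per variant); the function names are assumptions introduced here.

```python
import math

def variants_per_plate(wells, wells_per_variant=1):
    """Variants that fit on one plate when each variant occupies several wells
    (e.g., 5-10 wells for a concentration or incubation-time series)."""
    return wells // wells_per_variant

def plates_for_assays(n_assays, wells):
    """Plates needed to run n_assays, one assay per well."""
    return math.ceil(n_assays / wells)

if __name__ == "__main__":
    for wells in (96, 1536):
        for per_variant in (1, 5, 10):
            print(wells, per_variant, variants_per_plate(wells, per_variant))
    print(plates_for_assays(6_000, 96), plates_for_assays(20_000, 1536))
```

With 5-10 wells per variant, a 96-well plate holds only 9 to 19 variants, which is why the higher-density formats matter when dose series are run.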

EXAMPLES

The following examples are offered solely for the purposes of illustration, and 
are intended neither to limit nor to define the invention. 

EXAMPLE 1: OPTIMIZATION OF CRY1 TOXIN BY DNA SHUFFLING

The cry1C gene, including its own promoter (5' region up to -260 nt), is used
as the substrate for DNA shuffling. After DNA shuffling, the protein coding region is cloned
into an expression vector, and E. coli cells are transformed. The transformed cells are
incubated in a bacterial culture medium (nutrient broth) at 30°C for 72 hr, after which the
cells have formed inclusion bodies consisting of the Cry1C protein. The cells are then harvested
by either centrifugation or filtration and lysed with lysozyme to release free inclusion bodies.
Alternatively, lysis can be achieved by treatment with a detergent, sonication, or other
methods known to those of skill. The inclusion bodies are then collected by
centrifugation or filtration and exposed to an alkaline solution (pH 10.5) with or without a
disulfide bond reducing agent (e.g., 2-mercaptoethanol). The Cry1C protein dissolved in the
alkaline solution is then activated by trypsin. Trypsin digests the Cry1C protein down to the
66 kDa core. This trypsin-digested core, which is the active form of Cry1-type Bt insecticidal
proteins such as Cry1C, is purified with DEAE ion exchange resin. The activated Cry1C
protein is adsorbed onto the DEAE ion exchanger at pH 10.5 and then eluted with a salt such as
sodium chloride or ammonium acetate. Ammonium acetate is particularly desirable because
it can be evaporated during the subsequent concentration process. The activated protein is
then concentrated either by lyophilization or evaporation under vacuum and used in
screening. All the protein isolation processes described above are done in 96-well plates in
high throughput format using a robot. A robot designed for DNA/RNA isolation is
modified for this purpose.

The cry1C gene is shuffled with other cry genes that are homologous to
cry1C. To obtain the homologous genes, two oligonucleotide primers are synthesized based
on the cry1C 5' regions that contain the ribosome binding site and the trypsin activation site
(approximately 1800 nucleotides into the Cry1C protein coding region). These primers are
used to amplify the toxic portion of previously unknown cry genes from a B. thuringiensis
isolate. Normally, a B. thuringiensis strain contains multiple cry genes (as many as seven or
more) and these genes are often reasonably similar in sequence to cry1C. From one B.
thuringiensis isolate, four cry genes are amplified. The amplified genes are cloned into E.
coli, and selected clones are tested for sequence diversity by restriction mapping. For
mapping, restriction enzymes that have a 4 bp recognition sequence (e.g., Sau3A) are used.
Those cloned genes having restriction maps that are similar to, but substantially different
from, that of cry1C are selected for shuffling with cry1C. Alternatively, the cloned cry genes
are analyzed for diversity by multiple primer PCR analysis as described in Kalman et al.
(1993) Appl. Environ. Microbiol. 59: 1131-1137.

After DNA shuffling, host cells (E. coli or a Bacillus) sometimes fail to
produce the full length Cry proteins. This is due to undesirable mutations which make the
Cry protein unstable even in E. coli cells. Unstable mutants of the Cry protein are normally
inactive in insects, because insects can digest the proteins into non-active fragments.
Therefore, it is desirable to prescreen for those unstable mutants. In order to find those which
failed to produce the Cry protein, an immunoassay (e.g., ELISA) is performed. An
antiserum made against a C-terminal portion of the Cry protein is used. When the Cry
protein is not formed as a full length stable protein (i.e., 135 kDa), the antiserum made
against the C-terminal portion of the Cry protein fails to react. The antiserum directed towards the
C-terminal portion can be made by absorption of an antiserum which had been made against
the full length Cry protein with a truncated Cry protein with its C-terminus missing.
Alternatively, the C-terminus can be tagged with a common marker, such as histidine
residues. Another alternative analysis method involves subjecting the mutant Cry proteins to
SDS-PAGE.
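
The restriction-mapping comparison used in this Example to judge sequence diversity among cloned cry genes can be illustrated in silico. The sketch below assumes the clone sequences are known as strings, which is not how the Example itself proceeds (it compares gel patterns); the sequences, function names and the similarity measure are hypothetical. Sau3A recognizes GATC and cuts immediately 5' of the site.

```python
from collections import Counter

SAU3A_SITE = "GATC"  # Sau3A cuts immediately 5' of GATC

def fragment_lengths(seq, site=SAU3A_SITE):
    """Lengths of the fragments produced by complete digestion of a linear sequence."""
    seq = seq.upper()
    cuts = []
    i = seq.find(site)
    while i != -1:
        cuts.append(i)
        i = seq.find(site, i + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]

def shared_fraction(seq_a, seq_b):
    """Fraction of fragment sizes shared by two digests (a crude similarity measure)."""
    a, b = Counter(fragment_lengths(seq_a)), Counter(fragment_lengths(seq_b))
    shared = sum((a & b).values())
    return shared / max(sum(a.values()), sum(b.values()))

if __name__ == "__main__":
    clone_1 = "AAA" + "GATC" + "TTTT" + "GATC" + "CC"      # toy sequence
    clone_2 = "AAA" + "GATC" + "TTTTTT" + "GATC" + "CC"    # toy sequence
    print(fragment_lengths(clone_1), fragment_lengths(clone_2))
    print(shared_fraction(clone_1, clone_2))
```

Clones whose fragment patterns overlap partially, rather than fully or not at all, correspond to the "similar to, but substantially different from" criterion described above.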

EXAMPLE 2: SHUFFLING OF INSECTICIDAL TOXIN GENES OF BACILLUS
POPILLIAE

Bacillus popilliae, which is known to be a pathogen of scarab beetles such as
Japanese beetle, produces an insecticidal protein called Cry18Aa (Zhang et al. (1997) J.
Bacteriol. 179: 4336-4341). The insecticidal activity of this protein is not sufficiently high,
however, for large-scale use to prevent crop damage caused by beetle infestation. This
Example describes the optimization of Cry18Aa by shuffling the cry18Aa gene of B.
popilliae and cry2, which is its homologous gene of B. thuringiensis.

The cry18Aa gene is amplified by polymerase chain reaction (PCR) from B.
popilliae using two primers, which are designed according to the published sequence
(GenBank accession number: X99049). The forward primer (5'-gaaggaggctattggCCatgGac-3')
is based on the sequence around the ribosome binding site and translation start signal.
The sequence is modified as indicated with capital letters to include an NcoI site at the
translation start site. The reverse primer
(5'-ATATGGATCCTTAGTGATGGTGATGGTGATGataaagaggagtgtcatctgc-3') is based on the
sequence around the translation termination. This primer includes a coding sequence for six
consecutive histidine residues and a BamHI restriction site (capital letters) at the end of the
cry18Aa protein-coding region. The His tag is later used to purify the proteins produced by
E. coli cells that contain the shuffled genes. The amplification is made from lysed B.
popilliae cells by using a standard PCR method as described in the case of cry2 genes below.
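
The design of the two primers can be checked directly from their sequences. The sketch below, which is an illustration and not part of the described procedure, confirms that the capitalized bases of the forward primer create CCATGG (the NcoI recognition sequence, which contains the ATG start codon), and that the reverse complement of the reverse primer reads, in frame, as the last codons of the gene followed by six histidine codons, a stop codon and GGATCC (the BamHI recognition sequence). The helper names are assumptions.

```python
FORWARD = "gaaggaggctattggCCatgGac"
REVERSE = "ATATGGATCCTTAGTGATGGTGATGGTGATGataaagaggagtgtcatctgc"

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}

def reverse_complement(seq):
    """Reverse complement of a DNA sequence, preserving letter case."""
    return seq.translate(_COMPLEMENT)[::-1]

def translate(seq):
    """Translate from the first base, stopping at the first stop codon."""
    seq = seq.upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

if __name__ == "__main__":
    print("NcoI site in forward primer:", "CCATGG" in FORWARD.upper())
    print("BamHI site in reverse primer:", "GGATCC" in REVERSE.upper())
    coding_strand = reverse_complement(REVERSE)
    print(coding_strand)
    print(translate(coding_strand))
```

The translated coding strand ends in six histidines immediately before the stop codon, so the tag is fused in frame to the C-terminus of the encoded protein.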

Several different gene libraries are produced by DNA shuffling between the
cloned cry18Aa gene and its homologous genes. The cry2 genes of B. thuringiensis are
known to be homologous to the B. popilliae cry18Aa gene. The known cry2 genes are amplified
by PCR from several strains of B. thuringiensis (e.g., Bt kurstaki). B.
thuringiensis cells are lysed in a PCR tube at 100°C and used as a template. The cry2 genes
are amplified by PCR using a standard PCR protocol with appropriate primers that are
designed based on published cry2A sequences (e.g., GenBank accession Nos: M31738,
M23724, X57252, etc.). Additional genes homologous to cry18Aa are cloned and shuffled
with cry18Aa. Genomic libraries of several B. thuringiensis and B. popilliae strains are
screened with the cloned cry18Aa gene by Southern hybridization. In order to make the
genomic libraries, DNA from B. thuringiensis and B. popilliae is partially digested with
Sau3A to produce 1-10 kb fragments. Fragments of about 4 kb (3 to 5 kb range) are isolated
by gel electrophoresis and cloned in pBluescript (Stratagene). Several cry18Aa-homologous
genes are cloned from various B. popilliae isolates and B. thuringiensis strains such as Bt
kurstaki, Bt kenyae and Bt tolworthi subspecies.

The protein coding region of the shuffled genes is amplified by PCR and
cloned into an expression vector as described by Sasaki et al. ((1996) Curr. Microbiol. 31,
195-200). For high expression in E. coli, a portion of the cry promoter between the ApaI and
NdeI sites is removed from the original vector described by Sasaki et al. E. coli as well as
cry-negative B. thuringiensis are transformed with the vector containing the shuffled genes. The
transformants are screened by immunoassay with anti-6X-His antiserum for the production
of the insecticidal protein, and positive clones are saved for the screening as described
below.

When shuffled cry genes are expressed in E. coli, the cells typically produce
the toxin polypeptide as an inclusion body. The inclusion bodies are liberated by dissociating
E. coli cells with a detergent such as B-PER Bacterial Protein Extraction Reagent (Pierce)
according to the manufacturer's recommended procedure. The detergent is removed by
filtration, and the inclusion body is dissolved with 0.02 N NaOH. After the pH of the solution is
neutralized with 100 mM Tris-HCl, pH 8, the insecticidal protein encoded by the shuffled
gene is purified by Ni-NTA agarose (Qiagen) in a 96-well filter plate. A sufficient amount of
E. coli cells is used to produce an amount of the insecticidal protein that always exceeds
the capacity of the Ni-NTA agarose, regardless of the expression level. This is done to obtain a
roughly equal amount of the protein from each of the 96 wells.

The proteins produced by the shuffled genes are placed on insect diet and
allowed to be consumed by cucumber beetles. Mortality is observed to assess the activity
level of each protein sample. In order to increase the screening efficiency, 10 protein
samples are pooled and tested for activity. The amount of protein used in each test is
reduced to a sublethal dose, which is determined with the wild-type Cry18Aa protein.
Pooled samples showing some insect mortality are decoded into their 10 individual components to
pinpoint the sample or samples responsible for the mortality. Positive samples are selected for a
second round of shuffling.
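
The pool-and-decode strategy described above can be sketched as follows. This is a minimal illustration, assuming that an active sample remains detectable when mixed with nine inactive ones; the callbacks `pool_is_active` and `sample_is_active` stand in for the insect bioassay and are hypothetical.

```python
def pool_samples(sample_ids, pool_size=10):
    """Split the sample list into consecutive pools of pool_size samples."""
    return [sample_ids[i:i + pool_size] for i in range(0, len(sample_ids), pool_size)]

def deconvolute(pools, pool_is_active, sample_is_active):
    """Assay every pool; re-assay the members of active pools individually.

    Returns the list of active samples and the total number of assays run.
    """
    hits, assays = [], 0
    for pool in pools:
        assays += 1
        if pool_is_active(pool):
            for sample in pool:
                assays += 1
                if sample_is_active(sample):
                    hits.append(sample)
    return hits, assays

if __name__ == "__main__":
    actives = {7, 42}                      # hypothetical improved variants
    pools = pool_samples(list(range(96)))  # one 96-well plate of samples
    hits, assays = deconvolute(
        pools,
        pool_is_active=lambda pool: any(s in actives for s in pool),
        sample_is_active=lambda s: s in actives,
    )
    print(hits, assays)
```

When actives are rare, the number of bioassays falls well below one per sample, which is the efficiency gain the pooling step is meant to provide.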

Several rounds of shuffling are performed for substantially increased potency
of the B. popilliae Cry18Aa insecticidal protein.

EXAMPLE 3: CLONING OF PREVIOUSLY UNKNOWN GENES FROM INSECT
PATHOGENS THAT ENCODE INSECTICIDAL PROTEINS

Genomic DNA is prepared from several insect pathogens such as
Pseudomonas aeruginosa and Serratia entomophila. The DNA samples are digested with
several enzymes, including NotI, BamHI and SphI. The fragments produced with these
enzymes are fractionated by size and cloned in a cosmid vector, e.g., Supercos (Stratagene),
or a lambda vector, e.g., Lambda Zap (Stratagene), depending on the size. E. coli libraries
containing insect pathogen DNA are then screened for insecticidal activity using tomato
hornworm and cucumber beetles. E. coli cells are cultured in LB broth for 48 hr at 30°C and
harvested by centrifugation. The precipitated cells are resuspended in a minimum amount of
water and placed on insect diet. Insects are allowed to feed on this diet for 3 days. Several
cosmid clones showing insecticidal activity are identified, and DNA is isolated.

The cosmid DNA from those cells that have insecticidal activity is partially
digested with Sau3A to obtain fragments of about 4 kb. The fragments are end-repaired with
Klenow and cloned into the SmaI site of pBluescript (Stratagene). After screening about
4000 pBluescript subclones from one insect pathogen, several clones showing insecticidal
activity are typically obtained. These positive clones are used as probes to screen by
Southern hybridization to find homologous insecticidal genes within the same genus.

Homologous genes from Pseudomonas and Serratia species showing
insecticidal activity are combined in two groups and shuffled for higher activity as described
in this invention. The shuffled genes are cloned in E. coli and selected for higher insecticidal
activity as described in Example 2 for B. popilliae Cry18Aa.

EXAMPLE 4: TOXINS WITH IMPROVED ACTIVITY AGAINST CORN ROOTWORM
OBTAINED BY DNA SHUFFLING



This Example describes a method by which a family of homologous genes are
shuffled to obtain toxins that exhibit improved activity against corn rootworm. Several sets
of Bt cry genes are shuffled. A number of Bt Cry proteins are said to be active against
beetles (e.g., cry3Ba, cry3Bb, cry3Aa, cry3Ca, cry1Ia, cry1Ib, cry1Bc, cry1Bb, cry1Ba,
cry1Ka, cry7Aa, cry7Ab, cry8Aa, cry8Ba, cry8Ca, cry9Da, cry2Aa, cry2Ab, cry18Aa and
cry14Aa). Unfortunately the toxins encoded by these genes are known to be inactive or
weakly active against corn rootworm, thus indicating that they are good candidates for DNA
shuffling. When their sequences are compared, we find that they can be grouped by sequence
homology into 4 families. Family 1 includes cry3Ba, cry3Bb, cry3Aa and cry3Ca;
family 2 includes cry1Ia, cry1Ib, cry1Bc, cry1Bb, cry1Ba and cry1Ka; family 3 includes
cry7Aa, cry7Ab, cry8Aa, cry8Ba, cry8Ca and cry9Da; and family 4 includes cry2Aa,
cry2Ab, cry18Aa and cry14Aa. These genes can be amplified by PCR from appropriate Bt
strains. Alternatively, new, undisclosed genes can be cloned from Bt by screening Bt isolates by
Southern blotting using a DNA probe synthesized based on any of these published
sequences.
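
Grouping genes into families by sequence homology, as described above, can be sketched with a simple single-linkage rule: two genes fall in the same family if their percent identity meets a threshold, directly or through a chain of other genes. The example below is an illustration only. It assumes pre-aligned sequences of equal length, uses toy sequences rather than real cry genes, and the 80% threshold is an arbitrary assumption, not a value taken from the source.

```python
from itertools import combinations

def percent_identity(a, b):
    """Percent identical positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)

def group_by_identity(seqs, threshold):
    """Single-linkage grouping of named sequences using a union-find structure."""
    parent = {name: name for name in seqs}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if percent_identity(seqs[a], seqs[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())

if __name__ == "__main__":
    toy = {  # hypothetical aligned sequences
        "geneA": "ATGGCTAAAGGT",
        "geneB": "ATGGCTAAAGGA",
        "geneC": "TTCCAGTTTCCA",
        "geneD": "TTCCAGTTTCCG",
    }
    print(group_by_identity(toy, threshold=80.0))
```

Each resulting group would then be shuffled separately, in the manner the Example describes for its four families.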

Each of the families is individually shuffled. Since they all are active against
beetles and some (e.g., cry3Bb) are active against corn rootworm, one can identify shuffled
genes that encode toxins having improved activity against corn rootworm. Shuffling, gene
expression, protein isolation, and screening are essentially done by the methods described
herein.

EXAMPLE 5: TOXINS WITH IMPROVED ACTIVITY AGAINST NEMATODES 

In this Example, a set of cry genes are shuffled to obtain genes that encode
toxins having increased activity against nematodes. Genes that are shuffled include Bt
cry5Aa, cry5Ab, cry5Ac, cry6Aa, cry6Ba, cry12Aa, cry13Aa and cry21Aa. They can be
grouped and shuffled as described above. Toxins encoded by the shuffled genes are tested
for activity against the target nematodes.

EXAMPLE 6: USE OF NEMATODES FOR INTRODUCING AN OPTIMIZED GENE
INTO A PEST

This Example describes one method of using nematodes to introduce a
shuffled gene into an insect. Cyt genes from Bt are shuffled for better cytolytic activity. Cyt
proteins of Bt are known to recognize specific phospholipids on insect cells and insert the
molecule into the cell membrane to disrupt the membrane function. Their mode of action
substantially differs from that of Cry proteins and also from that of Bt. There are several
analogs within the cyt gene family (e.g., cyt1Aa, cyt1Ab, cyt1Ba, cyt2Aa, cyt2Ba and
cyt2Bb). Some of these genes, including cyt1Aa, cyt1Ba and cyt2Ba, are cloned from
appropriate Bt hosts using PCR techniques as described herein. The cloned genes are mixed
and shuffled. The shuffled genes are cloned in a Bacillus expression vector as described by
Sasaki et al. ((1996) Curr. Microbiol. 31: 195-200) and used to transform a cry-negative Bt
strain. Cyt proteins expressed in Bt are tested for cytotoxicity using Sf9 cells.

Those clones that exhibit improved cytotoxicity can be introduced into
Xenorhabdus luminescens (a symbiotic bacterium of an insecticidal nematode). Cyt genes
are amplified from Bt clones showing improved cytotoxicity with primers made on vector
portions, and the amplified genes are cut with one or more appropriate restriction enzymes to
release the coding region and portions of flanking regions. This fragment is cloned into
pTZ19R with the 20-kDa protein gene associated with cyt1A in Bt israelensis and used to
transform Xenorhabdus luminescens. This 20-kDa protein preserves the viability of host
cells and promotes expression of the shuffled cyt genes (Wu et al. (1993) J. Bacteriol. 175:
5276-5280). Recombinant X. luminescens is cocultivated with the nematode Steinernema
glaseri. When tested against scarab beetles, it is found that the nematode harboring the
recombinant X. luminescens requires a much lower dose to kill the insect than the
nematode with non-recombinant X. luminescens.

EXAMPLE 7: OPTIMIZATION OF A PROTEASE INHIBITOR GENE

A cysteine protease inhibitor gene is amplified by PCR from corn cDNA
utilizing a reported DNA sequence (GenBank: D38130). There are a number of homologous
genes found in rice, sorghum, cowpea, soybean, cabbage, potato, etc. DNA encoding a
portion (from 25 aa to 100 aa) of rice, soybean and cabbage cysteine protease inhibitor genes
is synthesized. These synthesized genes are mixed with the corn inhibitor gene and shuffled.
The shuffled genes are then cloned in an E. coli expression system, pQE-60, from Qiagen.
The shuffled genes are then expressed, and proteins are purified with Ni-NTA agarose in
96-well plates. Purified proteins are then tested for their protease inhibitory activity using a
crude preparation of cysteine protease prepared from white grubs. Actively feeding white grubs
are collected in the field and homogenized. After cell debris is removed by centrifugation,
the supernatant is used as the protease preparation without further purification. The grub
protease preparation is mixed with shuffled inhibitors and incubated for 20 min. The
protease activity is determined by fluorescent assay using EnzChek from Molecular Probes.
EnzChek utilizes a fluorescent dye-labeled protein in which the dye molecules are arranged in
such a way that fluorescence is quenched. When protease digests the protein, the dye becomes
fluorescent. A large number of shuffled inhibitor clones are identified as active by the
protease assay. Those active clones are screened for insecticidal activity by the Agrobacterium
rhizogenes method as described in this invention.
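
Because the fluorescence signal in this assay rises with protease activity, an effective inhibitor lowers the reading relative to an uninhibited control. One common way to express this, shown here as a sketch under stated assumptions, is percent inhibition relative to an uninhibited control after subtracting a no-protease blank. The control wells, the 50% cutoff and the function names are assumptions for illustration; the source does not specify how readings are scored.

```python
def percent_inhibition(sample, uninhibited, blank=0.0):
    """Percent inhibition from fluorescence readings.

    sample:      protease + inhibitor
    uninhibited: protease alone (assumed control)
    blank:       substrate alone, no protease (assumed control)
    """
    span = uninhibited - blank
    if span <= 0:
        raise ValueError("uninhibited control must exceed the blank")
    return 100.0 * (1.0 - (sample - blank) / span)

def active_clones(readings, uninhibited, blank, cutoff=50.0):
    """Names of clones whose inhibition meets the (arbitrary) cutoff."""
    return sorted(
        name
        for name, value in readings.items()
        if percent_inhibition(value, uninhibited, blank) >= cutoff
    )

if __name__ == "__main__":
    plate = {"clone_1": 300.0, "clone_2": 900.0, "clone_3": 600.0}  # toy readings
    for name, value in plate.items():
        print(name, round(percent_inhibition(value, 1100.0, 100.0), 1))
    print(active_clones(plate, uninhibited=1100.0, blank=100.0))
```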

EXAMPLE 8: CYTOTOXICITY ASSAY

Insecticidal proteins including those described in this invention are often
cytotoxic. For example, Bt Cry and Cyt proteins are known to kill cultured insect cells when
they are properly activated. In the examples below, we describe methods we used to screen
shuffled insecticidal gene products.

When disrupted by the insecticidal proteins, the insect cells release a
substantial amount of ATPase. The ATPase activity in the supernatant can be used as an
indicator of the cytotoxicity of an insecticidal protein. The shuffled Bt Cry proteins that
have been tagged with 6X-His are purified with Ni-NTA agarose as described before. The
purified proteins are then digested with 1/100 volume (w/w) trypsin for 30 min to activate
the protein. Several Lepidoptera insect cell lines, such as Sf9 and TN368 (Trichoplusia ni),
are used. The trypsin-activated Cry proteins are mixed with the cells in a 96-well plate at 0.1
to 1 ppm and incubated for 60 min. After the incubation, the cells are removed by filtration
and ATPase activity is measured by luciferase-luciferin assay (Sigma). This ATPase method
is more sensitive than other methods such as the dye exclusion method, in which cell death is
determined by staining with a dye like trypan blue. Dead cells are stained with trypan blue
while live cells are not.

EXAMPLE 9: SHUFFLING THE BT CRY GENE

In order to increase the diversity of the shuffled gene library, a Bt cry gene or
genes (called the primary genes) are shuffled using synthetic oligonucleotide shuffling (see,
Crameri et al. "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID
RECOMBINATION" filed February 5, 1999, USSN 60/118,813). In brief, a family of
homologous insect resistance nucleic acid sequences are first aligned, e.g., using available
computer software to select regions of identity/similarity and regions of diversity. A
plurality (e.g., 2, 5, 10, 20, 50, 75, or 100 or more) of oligonucleotides corresponding to at
least one region of diversity are synthesized. These oligonucleotides can be shuffled
directly, or can be recombined with one or more of the family of nucleic acids.

The oligonucleotide sequence can be taken from other genes called secondary
genes. The secondary genes have a certain degree of homology to the primary genes. There
are several ways to select parts of the secondary gene for the oligonucleotide synthesis. For
example, portions of the secondary gene can be selected at random. The DNA shuffling
process will select those oligonucleotides which can be incorporated into the shuffled genes.
The selected portions can be of any length as long as they are suitable to synthesize. The
oligonucleotides can also be designed based on the homology between the primary and
secondary genes. A certain degree of homology is necessary for crossover, which must
occur among DNA fragments during the shuffling. At the same time, strong heterogeneity is
desired for the diversity of the shuffled gene library. Furthermore, a specific portion of the
secondary genes can be selected for the oligonucleotide synthesis based on knowledge of
the protein sequence and function relationship. A large number of reports (extensively cited
in a review article: "Bacillus thuringiensis and its pesticidal crystal proteins," Schnepf, E.
et al., 1998, Microbiology and Molecular Biology Reviews, vol. 62, page 775) indicate that
"domain II", which is normally the middle portion of the fully activated Bt crystal
proteins, is important for Bt activity.

In the case of Cry1A-type proteins, domain II starts at about the 200th amino
acid residue and ends at about the 410th residue. This domain was found to be important for
the insect specificity of the Bt toxins. When the insect specificity is modified by the current
invention utilizing the DNA shuffling technology, the domain II portion of the nucleotide
sequence of the secondary genes can be selected as a target region for synthesizing
oligonucleotides used in an oligonucleotide shuffling procedure.

Domain I, which is the N-terminal portion of the fully activated Bt crystal
protein proximal to domain II, is involved in the membrane spanning function (see the
review of Schnepf et al.) of Cry. Since the insecticidal activity of the Bt crystal protein is, at
least in part, dependent on this function, the domain I portion of the secondary genes can be
selected for oligonucleotide shuffling for increased insecticidal activity. Domain III, which
is the C-terminal portion of the fully activated Bt crystal protein after domain II, can also be
selected for the oligonucleotide synthesis. This domain is occasionally involved in the insect
specificity (see Schnepf et al.).
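
To target a protein domain for oligonucleotide design, the residue range must be converted to coordinates in the coding sequence. The sketch below does this for the approximate Cry1A-type domain II boundaries given above (residues 200 to 410), assuming residue 1 corresponds to the first codon of the coding sequence. The function names and the toy sequence are illustrative assumptions.

```python
def residues_to_nucleotides(first_residue, last_residue):
    """Convert a 1-based residue range to a 1-based, inclusive nucleotide range."""
    if not 1 <= first_residue <= last_residue:
        raise ValueError("expected 1 <= first_residue <= last_residue")
    return (first_residue - 1) * 3 + 1, last_residue * 3

def extract_region(cds, first_residue, last_residue):
    """Return the part of a coding sequence that encodes the given residue range."""
    start, end = residues_to_nucleotides(first_residue, last_residue)
    if end > len(cds):
        raise ValueError("coding sequence is shorter than the requested range")
    return cds[start - 1:end]

if __name__ == "__main__":
    start, end = residues_to_nucleotides(200, 410)
    print(start, end, end - start + 1)       # nucleotide range and its length
    toy_cds = "ATG" * 500                    # placeholder, not a real cry gene
    print(len(extract_region(toy_cds, 200, 410)))
```

For residues 200 to 410 this gives nucleotides 598 to 1230 of the coding sequence, a 633-nucleotide target region.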

In one aspect, the primary cry2Aa and cry2Ab genes were shuffled with
several oligonucleotides that were synthesized based on the secondary cry2Ac gene
sequence. Cry2Aa and cry2Ab are highly homologous, but cry2Ac is substantially different
from these genes (see, e.g., Figure 3). Therefore, it was desirable to shuffle cry2Ac along
with cry2Aa and cry2Ab to increase the diversity of the resulting shuffled recombinant
nucleic acids. Portions of the cry2Ac sequence which are substantially different from the
corresponding portions of cry2Aa and cry2Ab were selected, and a series of 50-mer
oligonucleotides that cover these portions were synthesized. These oligonucleotides were
shuffled with the protein-coding region of cry2Aa and cry2Ab. When a certain number of
the clones were selected from the shuffled gene library and examined for diversity by
restriction mapping, good diversity was observed. The diversity was more than normally
expected from the shuffling of cry2Aa and cry2Ab alone.
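
The selection of divergent portions of a secondary gene and their coverage with 50-mer oligonucleotides can be sketched as follows. This is a simplified illustration, not the procedure actually used: it assumes the primary and secondary sequences are already aligned without gaps, scans fixed non-overlapping windows, and uses an arbitrary mismatch cutoff. The sequences and function names are hypothetical.

```python
def divergent_windows(primary, secondary, window=50, min_mismatches=10):
    """Start offsets (0-based) of non-overlapping windows in which the aligned
    secondary sequence differs from the primary at min_mismatches or more positions."""
    if len(primary) != len(secondary):
        raise ValueError("sequences must be aligned to the same length")
    starts = []
    for start in range(0, len(primary) - window + 1, window):
        mismatches = sum(
            p != s
            for p, s in zip(primary[start:start + window], secondary[start:start + window])
        )
        if mismatches >= min_mismatches:
            starts.append(start)
    return starts

def design_oligos(secondary, starts, window=50):
    """Oligonucleotide sequences taken from the secondary gene at the chosen windows."""
    return [secondary[s:s + window] for s in starts]

if __name__ == "__main__":
    primary = "A" * 50 + "C" * 50 + "G" * 50      # toy stand-in for a primary gene
    secondary = "A" * 50 + "T" * 50 + "G" * 50    # toy stand-in for a secondary gene
    starts = divergent_windows(primary, secondary)
    print(starts)
    print(design_oligos(secondary, starts))
```

In practice the oligonucleotides also need enough homology to the primary genes, typically at their ends, for crossover to occur, which this sketch does not model.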

Alternatively, a portion of the secondary genes can be obtained by PCR
amplification. The PCR amplified DNA can be shuffled with the primary genes. The
selection criteria mentioned above for the oligonucleotides can be applied to the PCR
amplification. The portions to be amplified can be randomly selected. Or, the selection can
be based on the sequence homology and heterogeneity. Also, the selection can be made
based on the sequence and function relationship. The PCR amplified portions can be domain
I for higher insecticidal activity or domain II/III for different insect specificity. Like
synthesized oligonucleotides, the PCR amplified portions of the secondary genes can be
shuffled with the primary genes.

EXAMPLE 10: HIGH-THROUGHPUT SCREEN FOR INSECTICIDAL ACTIVITY

This example provides an exemplary high-throughput strategy for obtaining
new insecticidal genes and proteins. First, the nucleic acids of choice (e.g., Bt genes or gene
fragments) are recombined. The resulting recombinant nucleic acids are transformed into a
strain of Bacillus thuringiensis that expresses the recombined nucleic acids in an active
protein form. Colonies are picked with the Q-bot as described supra. Optionally, pools of
transformed cells are grown in each well to increase the number of colonies which are
screened in the initial screening round. For example, screening 100 colonies in a well for
10,000 wells provides a screen of 10^6 colonies.
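
The pooling arithmetic above can be written out as a short sketch. The first function simply multiplies colonies per well by the number of wells. The second is an added illustration, not part of the described method: it estimates the chance that a variant present at a given library frequency appears at least once among the screened colonies, under the assumption that colonies are independent random draws from the library. Both function names are hypothetical.

```python
def screen_size(colonies_per_well: int, wells: int) -> int:
    """Total number of colonies examined when each well holds a pool of colonies."""
    return colonies_per_well * wells


def probability_sampled(frequency: float, colonies_screened: int) -> float:
    """Chance that at least one screened colony carries a variant present at the
    given library frequency, assuming independent random sampling."""
    return 1.0 - (1.0 - frequency) ** colonies_screened


# 100 colonies per well across 10,000 wells, as in the example above.
total = screen_size(100, 10_000)
```

Under the independence assumption, a variant present once per million library members has roughly a 63% chance of being included in a screen of this size, which is one reason pooling more colonies per well can be attractive in the first round.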

Sporulation is induced in a standard 96 (or more) well format. Several larvae
are added to each well. The plate is covered with an air-permeable membrane which retains
the larvae in the wells in which they were placed. Larvae are allowed to feed until they
receive a lethal dose from any spores expressing an insecticidal protein. The larvae are
moved to an incubation chamber and allowed to mature into insects. Mature insects fly
passively away, e.g., by using a chemoattractant or chemorepellant. All of the dead larvae
are harvested. The larvae contain insecticidal spores (there are typically some false positives
at this stage due to larvae that die from experimental manipulations rather than from
insecticidal proteins). The DNA from the larvae is recovered and the shuffled genes are
recovered by PCR. The genes are recloned and the process is repeated (e.g., by limiting
dilution of different positive clones) to further enrich for insecticidal proteins. A library of
such genes enriched for insecticidal activity is constructed. This library can be screened,
shuffled and otherwise manipulated by any of the techniques discussed herein.

Thus, this example utilizes the ability of a bolus of spores encoding a shuffled
Bt gene to kill larvae. The enrichment is based on separating dead larvae from larvae that
ingest innocuous shuffled Bt toxins. Bt genes are recovered and the process is repeated.
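
The effect of repeating the recover-and-rescreen cycle can be illustrated with a simple expected-value model. This is a sketch under stated assumptions rather than a description of measured behavior: each larva is treated as having ingested a single clone, clones encoding an active toxin kill with probability `kill_rate`, and larvae fed innocuous clones die from handling with probability `background_death` (the false positives noted above). The function names and the default rates are hypothetical.

```python
from typing import List


def enrich_once(active_fraction: float, kill_rate: float, background_death: float) -> float:
    """Expected fraction of active clones among genes recovered from dead larvae."""
    dead_active = active_fraction * kill_rate
    dead_inactive = (1.0 - active_fraction) * background_death
    total_dead = dead_active + dead_inactive
    if total_dead == 0:
        raise ValueError("no larvae die under these parameters")
    return dead_active / total_dead


def enrich(
    active_fraction: float,
    rounds: int,
    kill_rate: float = 0.9,          # hypothetical value
    background_death: float = 0.05,  # hypothetical false-positive rate
) -> List[float]:
    """Track the expected active fraction across repeated rounds of selection."""
    history = [active_fraction]
    for _ in range(rounds):
        active_fraction = enrich_once(active_fraction, kill_rate, background_death)
        history.append(active_fraction)
    return history
```

In this model the active fraction rises each round whenever the kill rate exceeds the background death rate, so false positives slow the enrichment but do not prevent it. Pooled wells, in which a larva may ingest several clones, would need a more detailed model.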

In related aspects, this assay could be adapted to bactericidal or fungicidal
proteins by infecting bacteria or fungi with shuffled genes and separating live cells from
dead cells, e.g., by FACS.

Modifications can be made to the method and materials as hereinbefore
described without departing from the spirit or scope of the invention as claimed, and the
invention can be put to a number of different uses, including:

The use of an integrated system to test insect resistance of shuffled DNAs,
including in an iterative process. The integrated system typically includes a computer with
software directing manipulation of fluids and cells as described above for assays directed to
assessing insect resistance or toxicity.

An assay, kit or system utilizing a use of any one of the selection strategies,
materials, components, methods or substrates hereinbefore described. Kits will optionally
additionally comprise instructions for performing methods or assays, packaging materials,
one or more containers which contain assay, device or system components, or the like.

In an additional aspect, the present invention provides kits embodying the
methods and apparatus herein. Kits of the invention optionally comprise one or more of the
following: (1) a shuffled component as described herein; (2) instructions for practicing the
methods described herein, and/or for operating the selection procedure herein; (3) one or
more insect resistance or toxicity assay components; (4) a container for holding insecticidal
proteins, nucleic acids, plants, insects, cells, or the like; and (5) packaging materials.

In a further aspect, the present invention provides for the use of any
component or kit herein, for the practice of any method or assay herein, and/or for the use of
any apparatus, composition, library or kit to practice any assay or method herein.

While the foregoing invention has been described in some detail for purposes
of clarity and understanding, it will be clear to one skilled in the art from a reading of this
disclosure that various changes in form and detail can be made without departing from the
true scope of the invention. For example, all the techniques and materials described above
can be used in various combinations. All publications and patent documents cited in this
application are incorporated by reference in their entirety for all purposes to the same extent
as if each individual publication or patent document were so individually denoted.
1. A method of obtaining an optimized recombinant pest resistance gene
which can confer resistance to a pest upon a plant in which the gene is expressed, the method
comprising:

(1) recombining a plurality of forms of a nucleic acid which comprise
segments derived from a gene which can confer upon a plant resistance to a pest, wherein the
plurality of forms of the nucleic acid differ from each other in two or more nucleotides, to
produce a library of recombinant pest resistance genes; and

(2) screening the library to identify at least one optimized recombinant
pest resistance gene that exhibits improved pest resistance capability compared to a
non-recombinant pest resistance gene.

2. The method of claim 1, wherein the method further comprises:

(3) recombining at least one optimized recombinant pest resistance
gene with a further form of the pest resistance gene, which is the same or different from one
or more of the plurality of nucleic acid forms of (1), to produce a further library of
recombinant pest resistance genes;

(4) screening the further library to identify at least one further
optimized recombinant pest resistance gene that exhibits a further improvement in pest
resistance capability compared to a non-recombinant pest resistance gene; and

(5) repeating (3) and (4), as necessary, until the further optimized
recombinant vector module that exhibits a further improvement in pest resistance capability
compared to a non-recombinant pest resistance gene.

3. The method of claim 1, wherein the improvement in pest resistance
capability comprises increased potency against the pest.
4. The method of claim 1, wherein the plurality of forms of a nucleic acid
comprises one or more nucleic acid derived from or corresponding to one or more of cry1Aa,
cry1Ab, cry1Ac, cry1Ad, cry1Ae, cry1Af, cry1Ag, cry1Ba, cry1Bb, cry1Bc, cry1Bd, cry1Ca,
cry1Cb, cry1Da, cry1Db, cry1Ea, cry1Eb, cry1Fa, cry1Fb, cry1Ga, cry1Gb, cry1Ha, cry1Hb,
cry1Ia, cry1Ib, cry1Ic, cry1Ja, cry1Jb, cry1Ka, cry1Jc, cry2Aa, cry2Ab, cry2Ac, cry3Aa,
cry3Ba, cry3Bb, cry3Ca, cry4Aa, cry4Ba, cry5Aa, cry5Ab, cry5Ac, cry5Ba, cry6Aa, cry6Ba,
cry7Aa, cry7Ab, cry8Aa, cry8Ba, cry8Ca, cry9Aa, cry9Ba, cry9Ca, cry9Da, cry9Ea, cry10Aa,
cry11Aa, cry11Ba, cry11Bb, cry12Aa, cry13Aa, cry14Aa, cry15Aa, cry16Aa, cry17Aa, cry18Aa,
cry19Aa, cry20Aa, cry21Aa, cry22Aa, cry23Aa, cry24Aa, cry25Aa, cry26Aa, cry27Aa,
cry28Aa, cyt1Aa, cyt1Ab, cyt1Ba, cyt2Aa, cyt2Ba, cyt2Bb.

5. The method of claim 1, wherein the nucleic acid comprises one or more
nucleic acid selected from: cry1Aa1, cry1Aa2, cry1Aa3, cry1Aa4, cry1Aa5, cry1Aa6, cry1Ab1,
cry1Ab2, cry1Ab3, cry1Ab4, cry1Ab5, cry1Ab6, cry1Ab7, cry1Ab8, cry1Ab9, cry1Ab10,
cry1Ac1, cry1Ac2, cry1Ac3, cry1Ac4, cry1Ac5, cry1Ac6, cry1Ac7, cry1Ac8, cry1Ac9,
cry1Ac10, cry1Ad1, cry1Ae1, cry1Af1, cry1Ba1, cry1Ba2, cry1Bb1, cry1Bc1, cry1Bd1,
cry1Ca1, cry1Ca2, cry1Ca3, cry1Ca4, cry1Ca5, cry1Ca6, cry1Ca7, cry1Cb1, cry1Da1, cry1Db1,
cry1Ea1, cry1Ea2, cry1Ea3, cry1Ea4, cry1Eb1, cry1Fa1, cry1Fa2, cry1Fb1, cry1Fb2, cry1Ga1,
cry1Ga2, cry1Gb1, cry1Ha1, cry1Hb1, cry1Ia1, cry1Ia2, cry1Ia3, cry1Ia4, cry1Ia5, cry1Ib1,
cry1Ic1, cry1Ja1, cry1Jb1, cry1Ka1, cry2Aa1, cry2Aa2, cry2Aa3, cry2Aa4, cry2Ab1, cry2Ab2,
cry2Ac1, cry3Aa1, cry3Aa2, cry3Aa3, cry3Aa4, cry3Aa5, cry3Aa6, cry3Ba1, cry3Ba2,
cry3Bb1, cry3Bb2, cry3Ca1, cry4Aa1, cry4Aa2, cry4Ba1, cry4Ba2, cry4Ba3, cry4Ba4, cry5Aa1,
cry5Ab1, cry5Ac1, cry5Ba1, cry6Aa1, cry6Ba1, cry7Aa1, cry7Ab1, cry7Ab2, cry8Aa1,
cry8Ba1, cry8Ca1, cry9Aa1, cry9Aa2, cry9Ba1, cry9Ca1, cry9Da1, cry9Da2, cry9Ea1,
cry10Aa1, cry11Aa1, cry11Aa2, cry11Ba1, cry11Bb1, cry11Bb1, cry12Aa1, cry13Aa1,
cry14Aa1, cry15Aa1, cry16Aa1, cry17Aa1, cry18Aa1, cry19Aa1, cry19Ba1, cry20Aa1,
cry21Aa1, cry22Aa1, cry24Aa1, cry25Aa1, cry26Aa1, cry28Aa1, cyt1Aa1, cyt1Aa2, cyt1Aa3,
cyt1Aa4, cyt1Ab1, cyt1Ba1, cyt2Aa1, cyt2Ba1, cyt2Ba2, cyt2Ba3, cyt2Ba4, cyt2Ba5, cyt2Ba6,
cyt2Bb1, 40kDa, cryC35, cryTDK, cryC53, vip1A, vip2A, vip3A(a), vip3A(b), and p21med.
6. The method of claim 1, wherein the improvement in pest resistance
capability comprises an increase in the range of pests that are susceptible to the pest
resistance gene.

7. The method of claim 1, wherein the improvement in pest resistance
capability comprises a decreased ability of a pest population to develop resistance to the
pest resistance gene.

8. The method of claim 1, wherein the improvement in pest resistance
capability comprises an increased expression level of a polypeptide encoded by the pest
resistance gene.

9. The method of claim 8, wherein the optimized recombinant pest
resistance gene comprises an increase in G-C content compared to a naturally occurring form
of the pest resistance gene.

10. The method of claim 1, wherein the improvement in pest resistance
capability comprises a decrease in susceptibility of a polypeptide encoded by the pest
resistance gene to protease cleavage or to high or low pH levels.

11. The method of claim 1, wherein the improvement in pest resistance
capability comprises a decrease in toxicity to a host plant of a polypeptide encoded by the
pest resistance gene.

12. The method of claim 1, wherein the pest is selected from the group
consisting of a nematode, a virus, and a bacterium.

13. The method of claim 1, wherein the pest is an insect.

14. The method of claim 13, wherein the insect is a larva.

15. The method of claim 13, wherein the plurality of forms of the nucleic
acid are derived from a gene which encodes a Bacillus toxin.
16. The method of claim 15, wherein the Bacillus is Bacillus thuringiensis.

17. The method of claim 15, wherein the Bacillus thuringiensis toxin is a
δ-endotoxin.

18. The method of claim 1, wherein the plurality of forms of the nucleic
acid comprise segments derived from one or more genes that encode a protease inhibitor, a
polyphenol oxidase, an insecticidal protease, a vegetative insecticidal protein, a lectin, or a
biosynthetic pathway for an insecticide.

19. The method of claim 18, wherein the gene encodes a vegetative
insecticidal protein of a Bacillus species.

20. The method of claim 19, wherein the Bacillus species is selected from
the group consisting of B. cereus, B. popilliae, B. sphaericus, and B. thuringiensis.

21. The method of claim 18, wherein the pest resistance gene encodes a
cholesterol oxidase.

22. A library which comprises a plurality of recombinant pest resistance
genes, wherein each recombinant pest resistance gene contains different permutations of
segments of the gene which can confer upon a plant resistance to the pest.

23. The library of claim 22, wherein the library comprises a plurality of
recombinant pest resistance genes which have been screened for ability to confer upon a
plant improved pest resistance capability compared to a non-recombinant pest resistance
gene.

24. The library of claim 23, wherein the library is a phage display library.

25. The library of claim 24, wherein the screening is performed by
identifying library members comprising a recombinant pest resistance gene which encode a
polypeptide having enhanced binding to a receptor for the polypeptide.
26. The library of claim 24, wherein the screening is performed by
identifying library members comprising a recombinant pest resistance gene which encode a
polypeptide having enhanced binding to an insect midgut.

27. The library of claim 26, wherein the midgut is inverted.

28. The library of claim 24, wherein the screening is performed by
subjecting the phage to consumption by insects, and amplifying DNA obtained from insects
which die using as primers a pair of oligonucleotides which hybridize to an expression
vector which comprises the recombinant pest resistance gene.

29. The library of claim 23, wherein the library is screened by contacting
insect cells with library members and identifying those library members that are toxic to the
insect cells.

30. The library of claim 22, wherein the library is prepared by a method
comprising:

(1) recombining a plurality of forms of a nucleic acid derived from a
gene which can confer upon a plant resistance to a pest, wherein the plurality of forms of the
nucleic acid differ from each other in two or more nucleotides, to produce a library of
recombinant pest resistance genes; and

(2) screening the library to identify at least one optimized recombinant
pest resistance gene that exhibits improved pest resistance capability compared to a
non-recombinant pest resistance gene.

31. A method of obtaining an organism that is pathogenic to a plant pest,
the method comprising:

(1) recombining a plurality of forms of a genomic nucleic acid derived
from a plurality of isolates of the organism, wherein the plurality of forms of the genomic
nucleic acid differ from each other in two or more nucleotides, to produce a library of
recombinant genomes;

(2) introducing the library of genomes into the plant pest; and

(3) identifying at least one optimized recombinant genome that
exhibits improved pathogenic activity against the pest compared to a non-recombinant
pathogen genomic nucleic acid.

32. The method of claim 31, wherein the organism is a virus.

33. The method of claim 32, wherein the virus is a baculovirus and the
organism is an insect.



PCT/US99/08473 



[Drawing sheet 1/5: schematic of sequence recombination in four panels. Labels on the
sheet, in order: POOL OF RELATED SEQUENCES; PANEL A; RANDOM FRAGMENTATION; PANEL B;
REASSEMBLE FRAGMENTS; PANEL C; LIBRARY OF RECOMBINANTS; PANEL D; SELECT BEST
RECOMBINANTS.]



WO 99/57128

[Drawing sheet 2/5: dendrogram of toxin sequences. Each leaf is labeled with a class
designation and a sequence accession number in parentheses, e.g., CytA (X03182),
CytB (Z14147), IIA (M31738), IIIA (M22472), IA(a) (M11250), IA(c) (M11068), and
IVB (X07423). Horizontal axis: similarity score (%), with ticks at 40, 60, 80, and 100.
The rotated vertical axis label is not legible.]



[FIG. 3: figure headed "PRIMARY" listing Cry and Cyt sequences, including Cry16Aa,
Cry17Aa, Cry5Aa, Cry5Ac, Cry5Ab, Cry21Aa, Cry13Aa, Cry14Aa, Cry2Aa, Cry2Ab, Cry2Ac,
Cry11Aa, Cry11Ba, Cyt1Aa, Cyt1Ab, Cyt2Ba, Cyt2Bb, Cyt2Aa, Cry6Aa, Cry6Ba, and Cry22Aa.
Two further labels of the form Cry1?Aa have an illegible digit.]



[FIG. 4 (drawing sheet 4/5): tree of Cry sequences (Cry1, Cry2, Cry3, Cry7, Cry8, and
others), each labeled with a source such as Mycogen, Ecogen, Abbott, PGS, Zeneca, Korea,
Cambridge, U Washington, Mycogen/Kubota, Hokkaido U, Purdue U., or Germany. Axis:
identity, with ticks at 70, 80, and 90. Several sequence names are not legible.]



[FIG. 5 (drawing sheet 5/5): drawing labeled "A. RHIZOGENES"; no other text is legible.]



INTERNATIONAL SEARCH REPORT 



International application No. 
PCT/US99/08473 



A. CLASSIFICATION OF SUBJECT MATTER

IPC(6): C07H 21/04; C12N 5/14
US CL: 536/23.1, 24.3; 435/419
According to International Patent Classification (IPC) or to both national classification and IPC

FIELDS SEARCHED

Minimum documentation searched (classification system followed by classification symbols):
U.S.: 536/23.1, 24.3; 435/419

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched:
NONE

Electronic data base consulted during the international search (name of data base and, where practicable, search terms used):
APS; STN/CAS/DERWENT; MEDLINE



DOCUMENTS CONSIDERED TO BE RELEVANT

Category / Citation of document, with indication, where appropriate, of the relevant passages / Relevant to claim No.

X,P / US 5,874,288 A (THOMPSON et al.) 23 February 1999, see entire reference. / 1-3, 8, 11-17, 19-20

Y / US 5,640,804 A (DRIVER et al.) 24 June 1997, entire reference. / 1-3, 6-21

A,P / US 5,882,851 A (KOCH et al.) 16 March 1999, entire reference. / 1-33

A / US 5,659,123 A (VAN RIE et al.) 19 August 1997, entire reference. / 1-33



[ ] Further documents are listed in the continuation of Box C. [ ] See patent family annex.

Special categories of cited documents:

"A" document defining the general state of the art which is not considered to be of particular relevance
"E" earlier document published on or after the international filing date
"L" document which may throw doubts on priority claim(s) or which is cited to establish the publication date of another citation or other special reason (as specified)
"O" document referring to an oral disclosure, use, exhibition or other means
"P" document published prior to the international filing date but later than the priority date claimed
"T" later document published after the international filing date or priority date and not in conflict with the application but cited to understand the principle or theory underlying the invention
"X" document of particular relevance; the claimed invention cannot be considered novel or cannot be considered to involve an inventive step when the document is taken alone
"Y" document of particular relevance; the claimed invention cannot be considered to involve an inventive step when the document is combined with one or more other such documents, such combination being obvious to a person skilled in the art
"&" document member of the same patent family



Date of the actual completion of the international search: 21 JUNE 1999

Date of mailing of the international search report: 19 AUG 1999

Name and mailing address of the ISA/US:
Commissioner of Patents and Trademarks
Box PCT
Washington, D.C. 20231
Facsimile No. (703) 305-3230

Telephone No. (703) 308-0196

Form PCT/ISA/210 (second sheet) (July 1992)



